Methods for identifying drug targets based on genomic sequence data

ABSTRACT

This invention provides a computational approach to identifying potential antibacterial drug targets based on a genome sequence and its annotation. Starting from a fully sequenced genome, open reading frame assignments are made which determine the metabolic genotype for the organism. The metabolic genotype, and more specifically its stoichiometric matrix, are analyzed using flux balance analysis to assess the effects of genetic deletions on the fitness of the organism and its ability to produce essential biomolecules required for growth.

RELATED APPLICATIONS

This application is a continuation of application Ser. No. 09/243,022, filed Feb. 2, 1999.

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to methods for identifying drug targets based on genomic sequence data. More specifically, this invention relates to systems and methods for determining suitable molecular targets for the directed development of antimicrobial agents.

2. Description of the Related Art

Infectious disease is on a rapid rise and threatens to regain its status as a major health problem. Prior to the discovery of antibiotics in the 1930s, infectious disease was a major cause of death. Further discoveries, development, and mass production of antibiotics throughout the 1940s and 1950s dramatically reduced deaths from microbial infections to a level where they effectively no longer represented a major threat in developed countries.

Over the years antibiotics have been liberally prescribed, and the strong selection pressure that this represents has led to the emergence of antibiotic resistant strains of many serious human pathogens. In some cases selected antibiotics, such as vancomycin, represent the last line of defense against certain pathogenic bacteria such as Staphylococcus. The possibility for staphylococci to acquire vancomycin resistance through exchange of genetic material with enterococci, which are commonly resistant to vancomycin, is a serious issue of concern to health care specialists. The pharmaceutical industry continues its search for new antimicrobial compounds, which is a lengthy and tedious, but very important, process. The rate of development and introduction of new antibiotics appears to no longer be able to keep up with the evolution of new antibiotic resistant organisms. The rapid emergence of antibiotic resistant organisms threatens to lead to a serious widespread health care concern.

The basis of antimicrobial chemotherapy is to selectively kill the microbe with minimal, and ideally no, harm to normal human cells and tissues. Therefore, ideal targets for antibacterial action are biochemical processes that are unique to bacteria, or those that are sufficiently different from the corresponding mammalian process to allow acceptable discrimination between the two. For effective antibiotic action it is clear that a vital target must exist in the bacterial cell and that the antibiotic be delivered to the target in an active form. Therefore resistance to an antibiotic can arise from: (i) chemical destruction or inactivation of the antibiotic; (ii) alteration of the target site to reduce or eliminate effective antibiotic binding; (iii) blocking antibiotic entry into the cell, or rapid removal from the cell after entry; and (iv) replacing the metabolic step inhibited by the antibiotic.

Thus, it is time to fundamentally re-examine the philosophy of microbial killing strategies and develop new paradigms. One such paradigm is a holistic view of cellular metabolism. The identification of “sensitive” metabolic steps in attaining the necessary metabolic flux distributions to support growth and survival that can be attacked to weaken or destroy a microbe need not be localized to a single biochemical reaction or cellular process. Rather, different cellular targets that need not be intimately related in the metabolic topology could be chosen based on the concerted effect the loss of each of these functions would have on metabolism.

A similar strategy with viral infections has recently proved successful. It has been shown that “cocktails” of different drugs that target different biochemical processes provide enhanced success in fighting against HIV infection. Such a paradigm shift is possible only if the necessary biological information as well as appropriate methods of rational analysis are available. Recent advances in the field of genomics and bioinformatics, in addition to mathematical modeling, offer the possibility to realize this approach.

At present, the field of microbial genetics is entering a new era where the genomes of several microorganisms are being completely sequenced. It is expected that in a decade or so, the nucleotide sequences of the genomes of all the major human pathogens will be completely determined. The sequencing of the genomes of pathogens such as Haemophilus influenzae has allowed researchers to compare the homology of proteins encoded by the open reading frames (ORFs) with those of Escherichia coli, resulting in valuable insight into the H. influenzae metabolic features. Similar analyses, such as those performed with H. influenzae, will provide details of metabolism spanning the hierarchy of metabolic regulation from bacterial genomes to phenotypes.

These developments provide exciting new opportunities to carry out conceptual experiments in silico to analyze different aspects of microbial metabolism and its regulation. Further, the synthesis of whole-cell models is made possible. Such models can account for each and every single metabolic reaction and thus enable the analysis of their role in overall cell function. To implement such analysis, however, a mathematical modeling and simulation framework is needed which can incorporate the extensive metabolic detail but still retain computational tractability. Fortunately, rigorous and tractable mathematical methods have been developed for the required systems analysis of metabolism.

A mathematical approach that is well suited to account for genomic detail and avoid reliance on kinetic complexity has been developed based on well-known stoichiometry of metabolic reactions. This approach is based on metabolic flux balancing in a metabolic steady state. The history of flux balance models for metabolic analyses is relatively short. It has been applied to metabolic networks, and the study of adipocyte metabolism. Acetate secretion from E. coli under ATP maximization conditions and ethanol secretion by yeast have also been investigated using this approach.

The complete sequencing of a bacterial genome and ORF assignment provides the information needed to determine the relevant metabolic reactions that constitute metabolism in a particular organism. Thus a flux-balance model can be formulated and several metabolic analyses can be performed to extract metabolic characteristics for a particular organism. The flux balance approach can be easily applied to systematically simulate the effect of single, as well as multiple, gene deletions. This analysis will provide a list of sensitive enzymes that could be potential antimicrobial targets.

The need to consider a new paradigm for dealing with the emerging problem of antibiotic resistant pathogens is of vital importance. The route towards the design of new antimicrobial agents must proceed along directions that are different from those of the past. The rapid growth in bioinformatics has provided a wealth of biochemical and genetic information that can be used to synthesize complete representations of cellular metabolism. These models can be analyzed with relative computational ease through flux-balance models and visual computing techniques. The ability to analyze the global metabolic network and understand the robustness and sensitivity of its regulation under various growth conditions offers promise in developing novel methods of antimicrobial chemotherapy.

In one example, Pramanik et al. described a stoichiometric model of E. coli metabolism using flux-balance modeling techniques (Stoichiometric Model of Escherichia coli Metabolism: Incorporation of Growth-Rate Dependent Biomass Composition and Mechanistic Energy Requirement, Biotechnology and Bioengineering, Vol. 56, No. 4, Nov. 20, 1997). However, the analytical methods described by Pramanik et al. can only be used for situations in which biochemical knowledge exists for the reactions occurring within an organism. Pramanik et al. produced a metabolic model for E. coli based on biochemical information rather than genomic data, since the metabolic genes and related reactions for E. coli had already been well studied and characterized. Thus, this method is inapplicable to determining a metabolic model for organisms for which little or no biochemical information on metabolic enzymes and genes is known. It can be envisioned that in the future the only information we may have regarding an emerging pathogen is its genomic sequence. What is needed in the art is a system and method for determining and analyzing the entire metabolic network of organisms whose metabolic reactions have not yet been determined from biochemical assays. The present invention provides such a system.

SUMMARY OF THE INVENTION

This invention relates to constructing metabolic genotypes and genome specific stoichiometric matrices from genome annotation data. The functions of the metabolic genes in the target organism are determined by homology searches against databases of genes from similar organisms. Once a potential function is assigned to each metabolic gene of the target organism, the resulting data is analyzed. In one embodiment, each gene is subjected to a flux-balance analysis to assess the effects of genetic deletions on the ability of the target organism to produce essential biomolecules necessary for its growth. Thus, the invention provides a high-throughput computational method to screen for genetic deletions which adversely affect the growth capabilities of fully sequenced organisms.

Embodiments of this invention also provide a computational, as opposed to an experimental, method for the rapid screening of genes and their gene products as potential drug targets to inhibit an organism's growth. This invention utilizes the genome sequence, the annotation data, and the biomass requirements of an organism to construct genomically complete metabolic genotypes and genome-specific stoichiometric matrices. These stoichiometric matrices are analyzed using a flux-balance analysis. This invention describes how to assess the effects of genetic deletions on the fitness and productive capabilities of an organism under given environmental and genetic conditions.

Construction of a genome-specific stoichiometric matrix from genomic annotation data is illustrated along with applying flux-balance analysis to study the properties of the stoichiometric matrix, and hence the metabolic genotype of the organism. By limiting the constraints on various fluxes and altering the environmental inputs to the metabolic network, genetic deletions may be analyzed for their effects on growth. This invention is embodied in a software application that can be used to create the stoichiometric matrix for a fully sequenced and annotated genome. Additionally, the software application can be used to further analyze and manipulate the network so as to predict the ability of an organism to produce biomolecules necessary for growth, thus essentially simulating a genetic deletion.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flow diagram illustrating one procedure for creating metabolic genotypes from genomic sequence data for any organism.

FIG. 2 is a flow diagram illustrating one procedure for producing in silico microbial strains from the metabolic genotypes created by the method of FIG. 1, along with additional biochemical and microbiological data.

FIG. 3 is a graph illustrating a prediction of genome scale shifts in transcription. The graph shows the different phases of the metabolic response to varying oxygen availability, starting from completely aerobic to completely anaerobic in E. coli. The predicted changes in expression pattern between phases II and V are indicated.

DETAILED DESCRIPTION OF THE INVENTION

This invention relates to systems and methods for utilizing genome annotation data to construct a stoichiometric matrix representing most or all of the metabolic reactions that occur within an organism. Using these systems and methods, the properties of this matrix can be studied under conditions simulating genetic deletions in order to predict the effect of a particular gene on the fitness of the organism. Moreover, genes that are vital to the growth of an organism can be found by selectively removing various genes from the stoichiometric matrix and thereafter analyzing whether an organism with this genetic makeup could survive. Analysis of these lethal genetic mutations is useful for identifying potential genetic targets for anti-microbial drugs.

It should be noted that the systems and methods described herein can be implemented on any conventional host computer system, such as those based on Intel® microprocessors and running Microsoft Windows operating systems. Other systems, such as those using the UNIX or LINUX operating system and based on IBM®, DEC® or Motorola® microprocessors are also contemplated. The systems and methods described herein can also be implemented to run on client-server systems and wide-area networks, such as the Internet.

Software to implement the system can be written in any well-known computer language, such as Java, C, C++, Visual Basic, FORTRAN or COBOL and compiled using any well-known compatible compiler.

The software of the invention normally runs from instructions stored in a memory on the host computer system. Such a memory can be a hard disk, Random Access Memory, Read Only Memory or Flash Memory. Other types of memories are also contemplated to function within the scope of the invention.

A process 10 for producing metabolic genotypes from an organism is shown in FIG. 1. Beginning at a start state 12, the process 10 then moves to a state 14 to obtain the genomic DNA sequence of an organism. The nucleotide sequence of the genomic DNA can be rapidly determined for an organism with a genome size on the order of a few million base pairs. One method for obtaining the nucleotide sequences in a genome is through commercial gene databases. Many gene sequences are available on-line through a number of sites (see, for example, www.tigr.org) and can easily be downloaded from the Internet. Currently, there are 16 microbial genomes that have been fully sequenced and are publicly available, with countless others held in proprietary databases. It is expected that a number of other organisms, including pathogenic organisms, will be found in nature for which little experimental information, except for their genome sequences, will be available.

Once the nucleotide sequence of the entire genomic DNA in the target organism has been obtained at state 14, the coding regions, also known as open reading frames, are determined at a state 16. Using existing computer algorithms, the location of open reading frames that encode genes from within the genome can be determined. For example, to identify the proper location, strand, and reading frame of an open reading frame one can perform a gene search by signal (promoters, ribosomal binding sites, etc.) or by content (positional base frequencies, codon preference). Computer programs for determining open reading frames are available, for example, from the University of Wisconsin Genetics Computer Group and the National Center for Biotechnology Information.
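The open reading frame search of state 16 can be illustrated with a minimal sketch. The code below is an assumption-laden toy, not one of the programs named above: it only looks for an ATG start codon followed in-frame by a stop codon on both strands, and ignores signal and content statistics entirely. The function names `reverse_complement` and `find_orfs` and the `min_codons` parameter are hypothetical.

```python
# Toy ORF scan: ATG ... stop, in all six reading frames (illustrative only).
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Yield (strand, frame, start, end, orf) for each ATG...stop stretch.

    start and end are 0-based offsets on the scanned strand; min_codons
    counts the start codon but not the stop codon.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        yield strand, frame, start, i + 3, s[start:i + 3]
                    start = None  # look for the next ORF in this frame


# Hypothetical 19-base sequence containing one short ORF on the forward strand.
example_orfs = list(find_orfs("CCATGAAACCCGGGTAACC"))
```

A real gene finder would add the signal and content evidence described above, since an ATG-to-stop scan alone produces many false positives in a genome of a few million base pairs.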

After the locations of the open reading frames have been determined at state 16, the process 10 moves to state 18 to assign a function to the protein encoded by each open reading frame. The discovery that an open reading frame or gene has sequence homology to a gene coding for a protein of known function, or family of proteins of known function, can provide the first clues about the function of the gene and its related protein. After the locations of the open reading frames have been determined in the genomic DNA from the target organism, well-established algorithms (e.g., the Basic Local Alignment Search Tool (BLAST) and the FAST family of programs) can be used to determine the extent of similarity between a given sequence and gene/protein sequences deposited in worldwide genetic databases. If a coding region from a gene in the target organism is homologous to a gene within one of the sequence databases, the open reading frame is assigned a function similar to the homologously matched gene. Thus, the functions of nearly the entire gene complement or genotype of an organism can be determined so long as homologous genes have already been discovered.

All of the genes involved in metabolic reactions and functions in a cell comprise only a subset of the genotype. This subset of genes is referred to as the metabolic genotype of a particular organism. Thus, the metabolic genotype of an organism includes most or all of the genes involved in the organism's metabolism. The gene products produced from the set of metabolic genes in the metabolic genotype carry out all or most of the enzymatic reactions and transport reactions known to occur within the target organism as determined from the genomic sequence.

To begin the selection of this subset of genes, one can simply search through the list of functional gene assignments from state 18 to find genes involved in cellular metabolism. This would include genes involved in central metabolism, amino acid metabolism, nucleotide metabolism, fatty acid and lipid metabolism, carbohydrate assimilation, vitamin and cofactor biosynthesis, energy and redox generation, etc. This subset is generated at a state 20. The process 10 of determining the metabolic genotype of the target organism from genomic data then terminates at an end state 22.
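The selection at state 20 amounts to filtering the functional assignments by category. The sketch below assumes the assignments are available as a simple mapping from gene identifier to a free-text functional category; the gene identifiers, the category strings, and the function name `select_metabolic_genotype` are all hypothetical, and the keyword list merely paraphrases the categories listed above.

```python
# Keywords paraphrasing the metabolic categories named in the text (assumed).
METABOLIC_KEYWORDS = (
    "central metabolism",
    "amino acid metabolism",
    "nucleotide metabolism",
    "lipid metabolism",
    "carbohydrate assimilation",
    "cofactor biosynthesis",
    "energy",
    "redox",
)


def select_metabolic_genotype(assignments, keywords=METABOLIC_KEYWORDS):
    """Return the subset of gene -> function entries that look metabolic."""
    return {
        gene: function
        for gene, function in assignments.items()
        if any(key in function.lower() for key in keywords)
    }


# Hypothetical functional assignments produced at state 18.
assignments = {
    "gene1": "Amino acid metabolism",
    "gene2": "DNA replication",
    "gene3": "Energy and redox generation",
    "gene4": "Cell division",
}
metabolic_genotype = select_metabolic_genotype(assignments)
```

In practice the category vocabulary depends on the annotation source, so the keyword list would need to be adapted and the result reviewed by hand.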

Referring now to FIG. 2, a process 50 for producing a computer model of an organism is shown. This process is also known as producing in silico microbial strains. The process 50 begins at a start state 52 (same as end state 22 of process 10) and then moves to a state 54 wherein biochemical information is gathered for the reactions performed by each metabolic gene product for each of the genes in the metabolic genotype determined from process 10.

For each gene in the metabolic genotype, the substrates and products, as well as the stoichiometry of any and all reactions performed by the gene product of each gene can be determined by reference to the biochemical literature. This includes information regarding the irreversible or reversible nature of the reactions. The stoichiometry of each reaction provides the molecular ratios in which reactants are converted into products.

Potentially, there may still remain a few reactions in cellular metabolism which are known to occur from in vitro assays and experimental data. These would include well characterized reactions for which a gene or protein has yet to be identified, or was unidentified from the genome sequencing and functional assignment of states 14 and 18. This would also include the transport of metabolites into or out of the cell by uncharacterized genes related to transport. Thus one reason for the missing gene information may be a lack of characterization of the actual gene that performs a known biochemical conversion. Therefore, upon careful review of existing biochemical literature and available experimental data, additional metabolic reactions can be added at a state 56 to the list of metabolic reactions determined from the metabolic genotype at state 54. This would include information regarding the substrates, products, reversibility/irreversibility, and stoichiometry of the reactions.

All of the information obtained at states 54 and 56 regarding reactions and their stoichiometry can be represented in a matrix format typically referred to as a stoichiometric matrix. Each column in the matrix corresponds to a given reaction or flux, and each row corresponds to the different metabolites involved in the given reaction/flux. Reversible reactions may either be represented as one reaction that operates in both the forward and reverse direction or be decomposed into one forward reaction and one backward reaction, in which case all fluxes can only take on positive values. Thus, a given position in the matrix describes the stoichiometric participation of a metabolite (listed in the given row) in a particular flux of interest (listed in the given column). Together all of the columns of the genome specific stoichiometric matrix represent all of the chemical conversions and cellular transport processes that are determined to be present in the organism. This includes all internal fluxes and so called exchange fluxes operating within the metabolic network. Thus, the process 50 moves to a state 58 in order to formulate all of the cellular reactions together in a genome specific stoichiometric matrix. The resulting genome specific stoichiometric matrix is a fundamental representation of a genomically and biochemically defined genotype.
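As an illustration of state 58, the following sketch assembles a stoichiometric matrix from a list of reactions, with one row per metabolite and one column per reaction as described above. The three-reaction network, the reaction names, and the function `build_stoichiometric_matrix` are hypothetical and chosen only to keep the example small; the sign convention (negative for consumed, positive for produced) is an assumption consistent with common practice.

```python
def build_stoichiometric_matrix(reactions):
    """Return (metabolites, reaction_ids, S).

    reactions maps a reaction id to {metabolite: coefficient}, where a
    negative coefficient means consumed and a positive one means produced.
    S[i][j] is the coefficient of metabolite i in reaction j.
    """
    reaction_ids = list(reactions)
    metabolites = sorted({m for stoich in reactions.values() for m in stoich})
    row = {m: i for i, m in enumerate(metabolites)}
    S = [[0.0] * len(reaction_ids) for _ in metabolites]
    for j, rid in enumerate(reaction_ids):
        for met, coeff in reactions[rid].items():
            S[row[met]][j] = float(coeff)
    return metabolites, reaction_ids, S


# Hypothetical network: uptake of A, conversion A -> 2 B, export of B.
toy_reactions = {
    "uptake_A": {"A": 1},
    "A_to_B": {"A": -1, "B": 2},
    "export_B": {"B": -1},
}
metabolites, reaction_ids, S = build_stoichiometric_matrix(toy_reactions)
```

Here the exchange fluxes (uptake and export) appear as columns with a single nonzero entry, while the internal conversion touches two rows. A reversible reaction could be added either as one column with an unrestricted flux or as two opposing columns, as the text notes.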

After the genome specific stoichiometric matrix is defined at state 58, the metabolic demands placed on the organism are calculated. The metabolic demands can be readily determined from the dry weight composition of the cell. In the case of well-studied organisms such as Escherichia coli and Bacillus subtilis, the dry weight composition is available in the published literature. However, in some cases it will be necessary to experimentally determine the dry weight composition of the cell for the organism in question. This can be accomplished with varying degrees of accuracy. The first attempt would measure the RNA, DNA, protein, and lipid fractions of the cell. A more detailed analysis would also provide the specific fraction of nucleotides, amino acids, etc. The process 50 moves to state 60 for the determination of the biomass composition of the target organism.

The process 50 then moves to state 62 to perform several experiments that determine the uptake rates and maintenance requirements for the organism. Microbiological experiments can be carried out to determine the uptake rates for many of the metabolites that are transported into the cell. The uptake rate is determined by measuring the depletion of the substrate from the growth media. The measurement of the biomass at each point is also required, in order to determine the uptake rate per unit biomass. The maintenance requirements can be determined from a chemostat experiment. The glucose uptake rate is plotted versus the growth rate, and the y-intercept is interpreted as the non-growth associated maintenance requirements. The growth associated maintenance requirements are determined by fitting the model results to the experimentally determined points in the growth rate versus glucose uptake rate plot.
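The y-intercept reading of the chemostat plot can be sketched as an ordinary least-squares line fit of glucose uptake rate against growth rate. The data points below are invented for illustration and carry no experimental meaning; `fit_line` is a hypothetical helper name, and least squares is one reasonable fitting choice rather than a method stated in the text.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented chemostat data: growth rate (1/h) vs glucose uptake (mmol/gDW/h).
growth_rate = [0.1, 0.2, 0.3, 0.4]
glucose_uptake = [1.5, 2.5, 3.5, 4.5]

slope, intercept = fit_line(growth_rate, glucose_uptake)
# intercept: uptake extrapolated to zero growth, read as the
# non-growth associated maintenance requirement (in glucose units).
```

The intercept is expressed in substrate units; converting it to an energy requirement would need an assumed yield of energy per unit substrate, which this sketch does not attempt.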

Next, the process 50 moves to a state 64 wherein information regarding the metabolic demands and uptake rates obtained at states 60 and 62 is combined with the genome specific stoichiometric matrix of state 58 to fully define the metabolic system using flux balance analysis (FBA). This is an approach well suited to account for genomic detail as it has been developed based on the well-known stoichiometry of metabolic reactions. The time constants characterizing metabolic transients and/or metabolic reactions are typically very rapid, on the order of milliseconds to seconds, compared to the time constants of cell growth, on the order of hours to days. Thus, the transient mass balances can be simplified to only consider the steady state behavior. Eliminating the time derivatives obtained from dynamic mass balances around every metabolite in the metabolic system yields the system of linear equations represented in matrix notation,

S*v=0  Equation 1

where S refers to the stoichiometric matrix of the system, and v is the flux vector. This equation simply states that over long times, the formation fluxes of a metabolite must be balanced by the degradation fluxes. Otherwise, significant amounts of the metabolite will accumulate inside the metabolic network. Applying Equation 1 to our system, we let S now represent the genome specific stoichiometric matrix.
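Equation 1 can be checked numerically for any candidate flux vector. The sketch below uses a hypothetical two-metabolite network (uptake of A, conversion A -> 2 B, export of B) and hypothetical helper names; it simply evaluates S*v row by row and tests whether every metabolite is balanced.

```python
def mass_balance_residual(S, v):
    """Return S*v: the net production rate of each metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_steady_state(S, v, tol=1e-9):
    """True if every metabolite's formation and degradation fluxes balance."""
    return all(abs(r) <= tol for r in mass_balance_residual(S, v))


# Rows: metabolites A, B. Columns: uptake_A, A -> 2 B, export_B (assumed toy).
S = [
    [1.0, -1.0, 0.0],
    [0.0, 2.0, -1.0],
]

balanced = [1.0, 1.0, 2.0]    # B is exported as fast as it is made
unbalanced = [1.0, 1.0, 1.0]  # B is made faster than it is exported
```

For the second vector the residual for B is positive, which corresponds to the accumulation of a metabolite that the steady state assumption rules out.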

To determine the metabolic capabilities of a defined metabolic genotype, Equation 1 is solved for the metabolic fluxes and the internal metabolic reactions, v, while imposing constraints on the activity of these fluxes. Typically the number of metabolic fluxes is greater than the number of mass balances (i.e., m>n), resulting in a plurality of feasible flux distributions that satisfy Equation 1 and any constraints placed on the fluxes of the system. This range of solutions is indicative of the flexibility in the flux distributions that can be achieved with a given set of metabolic reactions. The solutions to Equation 1 lie in a restricted region. This subspace defines the capabilities of the metabolic genotype of a given organism, since the allowable solutions that satisfy Equation 1 and any constraints placed on the fluxes of the system define all the metabolic flux distributions that can be achieved with a particular set of metabolic genes.

The particular utilization of the metabolic genotype can be defined as the metabolic phenotype that is expressed under those particular conditions. Objectives for metabolic function can be chosen to explore the ‘best’ use of the metabolic network within a given metabolic genotype. The solution to Equation 1 can be formulated as a linear programming problem, in which the flux distribution that minimizes a particular objective is found. Mathematically, this optimization can be stated as:

Minimize Z  Equation 2

where Z = Σ_i c_i v_i = c*v  Equation 3

where Z is the objective, which is represented as a linear combination of metabolic fluxes v_i. The optimization can also be stated as the equivalent maximization problem, i.e., by changing the sign on Z.

This general representation of Z enables the formulation of a number of diverse objectives. These objectives can be design objectives for a strain, exploitation of the metabolic capabilities of a genotype, or physiologically meaningful objective functions, such as maximum cellular growth. For this application, growth is to be defined in terms of biosynthetic requirements based on literature values of biomass composition or experimentally determined values such as those obtained from state 60. Thus, we can define biomass generation as an additional reaction flux draining intermediate metabolites in the appropriate ratios and represented as an objective function Z. In addition to draining intermediate metabolites, this reaction flux can be formed to utilize energy molecules such as ATP, NADH and NADPH so as to incorporate any maintenance requirement that must be met. This new reaction flux then becomes another constraint/balance equation that the system must satisfy as the objective function. It is analogous to adding an additional column to the stoichiometric matrix of Equation 1 to represent such a flux to describe the production demands placed on the metabolic system. Setting this new flux as the objective function and asking the system to maximize the value of this flux for a given set of constraints on all the other fluxes is then a method to simulate the growth of the organism.
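Adding the biomass flux as an extra column can be sketched as follows. The demand values, the metabolite names, and the function `add_biomass_column` are hypothetical; the only idea taken from the text is that the new column drains each required metabolite in a fixed ratio, so its entries are the negated requirements.

```python
def add_biomass_column(S, metabolites, demand):
    """Return a copy of S with one extra column for the biomass flux.

    demand maps metabolite -> amount drained per unit of biomass formed;
    drained metabolites get a negative coefficient in the new column.
    """
    unknown = set(demand) - set(metabolites)
    if unknown:
        raise KeyError("metabolites not in network: %s" % sorted(unknown))
    return [
        row + [-float(demand[met]) if met in demand else 0.0]
        for row, met in zip(S, metabolites)
    ]


# Assumed toy network: rows A, B, ATP; columns are three arbitrary reactions.
metabolites = ["A", "B", "ATP"]
S = [
    [1.0, -1.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
# Invented requirement: 1 B and 2 ATP drained per unit of biomass.
S_with_biomass = add_biomass_column(S, metabolites, {"B": 1.0, "ATP": 2.0})
```

Maximizing the flux through this new column, subject to Equation 1 and the flux constraints, is then the growth simulation described above.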

Using linear programming, additional constraints can be placed on the value of any of the fluxes in the metabolic network.

β_j ≤ v_j ≤ α_j  Equation 4

These constraints could be representative of a maximum allowable flux through a given reaction, possibly resulting from a limited amount of an enzyme present, in which case the value for α_j would take on a finite value. These constraints could also be used to include the knowledge of the minimum flux through a certain metabolic reaction, in which case the value for β_j would take on a finite value. Additionally, if one chooses to leave certain reversible reactions or transport fluxes to operate in a forward and reverse manner, the flux may remain unconstrained by setting β_j to negative infinity and α_j to positive infinity. If reactions proceed only in the forward direction, β_j is set to zero while α_j is set to positive infinity. As an example, to simulate the event of a genetic deletion, the fluxes through all of the corresponding metabolic reactions related to the gene in question are reduced to zero by setting β_j and α_j to zero in Equation 4. Based on the in vivo environment where the bacterium lives, one can determine the metabolic resources available to the cell for biosynthesis of essential molecules for biomass. Allowing the corresponding transport fluxes to be active provides the in silico bacteria with inputs and outputs for substrates and by-products produced by the metabolic network. Therefore, as an example, if one wished to simulate the absence of a particular growth substrate, one simply constrains the corresponding transport fluxes allowing the metabolite to enter the cell to be zero by setting β_j and α_j to zero in Equation 4. On the other hand, if a substrate is only allowed to enter or exit the cell via transport mechanisms, the corresponding fluxes can be properly constrained to reflect this scenario.
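Equations 1 through 4 together form a linear program. The sketch below solves a hypothetical three-reaction instance with SciPy's `linprog`, which is an assumed choice of solver and not one named in the text. Because `linprog` minimizes, the growth flux is maximized by giving it an objective coefficient of -1. The network, the bound values, and the helper name `max_growth` are illustrative assumptions.

```python
from scipy.optimize import linprog


def max_growth(S, bounds, growth_index):
    """Maximize the flux at growth_index subject to S*v = 0 and the bounds.

    bounds is a list of (lower, upper) pairs, one per flux, playing the
    role of the lower and upper limits in Equation 4; None means unbounded.
    """
    c = [0.0] * len(bounds)
    c[growth_index] = -1.0  # linprog minimizes, so negate to maximize
    result = linprog(c, A_eq=S, b_eq=[0.0] * len(S), bounds=bounds)
    if result.status != 0:
        raise RuntimeError(result.message)
    return -result.fun, list(result.x)


# Assumed toy network. Rows: A, B. Columns: uptake_A, A -> B, biomass drain.
S = [
    [1.0, -1.0, 0.0],
    [0.0, 1.0, -1.0],
]
bounds = [
    (0.0, 10.0),  # uptake limited to 10 (finite upper limit)
    (0.0, None),  # irreversible internal reaction
    (0.0, None),  # biomass flux
]
growth, fluxes = max_growth(S, bounds, growth_index=2)

# Simulating the absence of the substrate: both limits of the uptake flux = 0.
no_substrate = [(0.0, 0.0)] + bounds[1:]
growth_without_A, _ = max_growth(S, no_substrate, growth_index=2)
```

In this toy case growth is limited by the uptake bound, and closing the transport flux removes all growth, which mirrors the substrate-absence scenario described above.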

Together the linear programming representation of the genome-specific stoichiometric matrix as in Equation 1, along with any general constraints placed on the fluxes in the system and any of the possible objective functions, completes the formulation of the in silico bacterial strain. The in silico strain can then be used to study theoretical metabolic capabilities by simulating any number of conditions and generating flux distributions through the use of linear programming. The process 50 of formulating the in silico strain and simulating its behavior using linear programming techniques terminates at an end state 66.

Thus, by adding or removing constraints on various fluxes in the network it is possible to (1) simulate a genetic deletion event and (2) simulate or accurately provide the network with the metabolic resources present in its in vivo environment. Using flux balance analysis it is possible to determine the effects of the removal or addition of particular genes and their associated reactions to the composition of the metabolic genotype on the range of possible metabolic phenotypes. If the removal/deletion does not allow the metabolic network to produce necessary precursors for growth, and the cell cannot obtain these precursors from its environment, the deleted gene(s) have potential as an antimicrobial drug target. Thus by adjusting the constraints and defining the objective function we can explore the capabilities of the metabolic genotype using linear programming to optimize the flux distribution through the metabolic network. This creates what we will refer to as an in silico bacterial strain capable of being studied and manipulated to analyze, interpret, and predict the genotype-phenotype relationship. It can be applied to assess the effects of incremental changes in the genotype or changing environmental conditions, and provide a tool for computer aided experimental design. It should be realized that other types of organisms can similarly be represented in silico and still be within the scope of the invention.
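A deletion screen of the kind described here can be sketched by closing, one gene at a time, every reaction associated with that gene and re-solving the growth problem. The example below again assumes SciPy's `linprog` as the solver. The five-reaction network, the gene names, the gene-to-reaction mapping, and the 0.1% growth cutoff are all hypothetical.

```python
from scipy.optimize import linprog


def max_growth(S, bounds, growth_index):
    """Maximal growth flux subject to S*v = 0 and per-flux bounds."""
    c = [0.0] * len(bounds)
    c[growth_index] = -1.0  # linprog minimizes, so negate to maximize
    result = linprog(c, A_eq=S, b_eq=[0.0] * len(S), bounds=bounds)
    if result.status != 0:
        raise RuntimeError(result.message)
    return -result.fun


def deletion_screen(S, bounds, growth_index, gene_to_reactions):
    """Return {gene: maximal growth with that gene's reactions set to zero}."""
    growth = {}
    for gene, reaction_indices in gene_to_reactions.items():
        knocked = list(bounds)
        for j in reaction_indices:
            knocked[j] = (0.0, 0.0)  # both limits zero: reaction removed
        growth[gene] = max_growth(S, knocked, growth_index)
    return growth


# Assumed toy network. Rows: A, B, C.
# Columns: uptake_A, A->B (geneX), A->B (geneY), B->C (geneZ), biomass drain.
S = [
    [1.0, -1.0, -1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, -1.0],
]
bounds = [(0.0, 10.0)] + [(0.0, None)] * 4
genes = {"geneX": [1], "geneY": [2], "geneZ": [3]}

wild_type = max_growth(S, bounds, growth_index=4)
single = deletion_screen(S, bounds, 4, genes)
candidates = sorted(g for g, mu in single.items() if mu < 1e-3 * wild_type)

# A multiple deletion is screened by listing several reactions under one key.
double = deletion_screen(S, bounds, 4, {"geneX+geneY": [1, 2]})
```

In this toy network geneX and geneY are redundant, so neither single deletion affects growth, while the deletion of geneZ or of both redundant genes together abolishes it. This illustrates why the text considers multiple as well as single deletions.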

The construction of a genome specific stoichiometric matrix and in silico microbial strains can also be applied to the area of signal transduction. The components of signaling networks can be identified within a genome and used to construct a content matrix that can be further analyzed using various techniques to be determined in the future.

A. Example 1: E. coli Metabolic Genotype and in silico Model

Using the methods disclosed in FIGS. 1 and 2, an in silico strain of Escherichia coli K-12 has been constructed and represents the first such strain of a bacterium largely generated from annotated sequence data and from biochemical information. The genetic sequence and open reading frame identifications and assignments are readily available from a number of on-line locations (e.g., www.tigr.org). For this example we obtained the annotated sequence from the website for the E. coli Genome Project at the University of Wisconsin (http://www.genetics.wisc.edu). Details regarding the actual sequencing and annotation of the sequence can be found at that site. From the genome annotation data the subset of genes involved in cellular metabolism was determined as described above in FIG. 1, state 20, comprising the metabolic genotype of the particular strain of E. coli.

Through detailed analysis of the published biochemical literature on E. coli we determined (1) all of the reactions associated with the genes in the metabolic genotype and (2) any additional reactions known to occur from biochemical data which were not represented by the genes in the metabolic genotype. This provided all of the necessary information to construct the genome specific stoichiometric matrix for E. coli K-12.

Briefly, the E. coli K-12 bacterial metabolic genotype, and more specifically the genome specific stoichiometric matrix, contains 731 metabolic processes that influence 436 metabolites (the dimensions of the genome specific stoichiometric matrix are 436×731). There are 80 reactions present in the genome specific stoichiometric matrix that do not have a genetic assignment in the annotated genome, but are known to be present from biochemical data. The genes contained within this metabolic genotype are shown in Table 1 along with the corresponding reactions they carry out.
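As a rough illustration of how reaction entries written in the notation of Table 1 can be assembled into a stoichiometric matrix (metabolites as rows, reactions as columns), the following sketch parses four glycolytic reactions from the table. The function names parse_side and build_matrix are hypothetical, and the parser assumes that terms are separated by " + " and that an optional numeric coefficient is separated from the metabolite name by a space.

```python
import re


def parse_side(side):
    """Return {metabolite: coefficient} for one side of a reaction string."""
    terms = {}
    for term in side.split(" + "):
        term = term.strip()
        if not term:
            continue
        # Optional leading coefficient (e.g. "2 NADH", ".5 O2"), then the name.
        match = re.fullmatch(r"(?:(\d*\.?\d+)\s+)?(\S+)", term)
        coefficient = float(match.group(1)) if match.group(1) else 1.0
        name = match.group(2)
        terms[name] = terms.get(name, 0.0) + coefficient
    return terms


def build_matrix(reactions):
    """Build (metabolites, matrix, reversible flags) from reaction strings."""
    columns = []
    for text in reactions:
        text = text.replace("\u2212", "-")  # normalize the typographic minus
        reversible = "<->" in text
        left, right = text.split("<->" if reversible else "->")
        column = {}
        for name, coefficient in parse_side(left).items():
            column[name] = column.get(name, 0.0) - coefficient  # consumed
        for name, coefficient in parse_side(right).items():
            column[name] = column.get(name, 0.0) + coefficient  # produced
        columns.append((column, reversible))
    metabolites = sorted({name for column, _ in columns for name in column})
    matrix = [[column.get(name, 0.0) for column, _ in columns]
              for name in metabolites]
    return metabolites, matrix, [flag for _, flag in columns]


# Four entries from Table 1 (glk, pgi, pfkA, fba).
reactions = [
    "GLC + ATP -> G6P + ADP",
    "G6P <-> F6P",
    "F6P + ATP -> FDP + ADP",
    "FDP <-> T3P1 + T3P2",
]
metabolites, matrix, reversible = build_matrix(reactions)
print(len(metabolites), "x", len(reactions))
for name, row in zip(metabolites, matrix):
    print(f"{name:5s}", row)
```

Applied to every reaction of the metabolic genotype, the same construction would yield a matrix of the 436×731 dimensions stated above.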

Because E. coli is arguably the best studied organism, it was possible to determine the uptake rates and maintenance requirements (state 62 of FIG. 2) by reference to the published literature. This in silico strain accounts for the metabolic capabilities of E. coli. It includes membrane transport processes, the central catabolic pathways, utilization of alternative carbon sources and the biosynthetic pathways that generate all the components of the biomass. In the case of E. coli K-12, we can call upon the wealth of data on overall metabolic behavior and detailed biochemical information about the in vivo genotype to which we can compare the behavior of the in silico strain. One utility of FBA is the ability to learn about the physiology of the particular organism and explore its metabolic capabilities without any specific biochemical data. This ability is important considering possible future scenarios in which the only data that we may have for a newly discovered bacterium (perhaps pathogenic) could be its genome sequence.

B. Example 2: in silico Deletion Analysis for E. coli to Find Antimicrobial Targets

Using the in silico strain constructed in Example 1, the effect of individual deletions of all the enzymes in central metabolism can be examined in silico. For the analysis to determine sensitive linkages in the metabolic network of E. coli, the objective function utilized is the maximization of the biomass yield. This is defined as a flux draining the necessary biosynthetic precursors in the appropriate ratios. These ratios are given by the biomass composition, which can be determined from the literature. See Neidhardt et al., Escherichia coli and Salmonella: Cellular and Molecular Biology, Second Edition, ASM Press, Washington, D.C., 1996. Thus, the objective function is the maximization of a single flux, this biosynthetic flux.
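The optimization described in this paragraph can be written compactly as a linear program. The notation below is assumed for illustration and may differ from notation used elsewhere in this document: S is the genome specific stoichiometric matrix, v the vector of fluxes, α_j and β_j the lower and upper constraints on flux j, X_m a biosynthetic precursor, and d_m its ratio in the biomass composition.

```latex
\begin{aligned}
\text{maximize}\quad & v_{\text{biomass}} \\
\text{subject to}\quad & S\,v = 0, \\
& \alpha_j \le v_j \le \beta_j \quad \text{for every flux } j, \\
\text{where}\quad & \sum_{m} d_m X_m \;\xrightarrow{\;v_{\text{biomass}}\;}\; \text{biomass}.
\end{aligned}
```

Under this formulation a gene deletion corresponds to setting α_j = β_j = 0 for each reaction j that depends solely on the deleted gene.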

Constraints are placed on the network to account for the availability of substrates for the growth of E. coli. In the initial deletion analysis, growth was simulated in an aerobic glucose minimal medium culture. Therefore, the constraints are set to allow the components included in the medium to be taken up. The specific uptake rate can be included if the value is known; otherwise, an unlimited supply can be provided. The uptake rates of glucose and oxygen have been determined for E. coli (Neidhardt et al., Escherichia coli and Salmonella: Cellular and Molecular Biology, Second Edition, ASM Press, Washington, D.C., 1996). Therefore, these values are included in the analysis. The uptake rates for the phosphate, sulfur, and nitrogen sources are not precisely known, so constraints on the fluxes for the uptake of these important substrates are not included, and the metabolic network is allowed to take up any required amount of these substrates.
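The way such uptake constraints might be assembled can be sketched as follows. This is an illustration under stated assumptions: the function name uptake_bounds, the substrate names, and the numeric rates are hypothetical placeholders, not the literature values referred to above.

```python
import math


def uptake_bounds(exchangeable, medium, measured_rates):
    """Return {substrate: (lower, upper)} bounds on each uptake flux.

    Substrates absent from the medium cannot be taken up. Substrates in the
    medium are limited to their measured rate when one is known and are
    otherwise left unlimited.
    """
    bounds = {}
    for substrate in exchangeable:
        if substrate not in medium:
            bounds[substrate] = (0.0, 0.0)
        else:
            bounds[substrate] = (0.0, measured_rates.get(substrate, math.inf))
    return bounds


# Hypothetical inputs for an aerobic glucose minimal medium.
exchangeable = ["glucose", "oxygen", "phosphate", "sulfate", "ammonia", "acetate"]
medium = {"glucose", "oxygen", "phosphate", "sulfate", "ammonia"}
measured_rates = {"glucose": 10.0, "oxygen": 15.0}  # placeholder values only

bounds = uptake_bounds(exchangeable, medium, measured_rates)
for substrate, limits in bounds.items():
    print(substrate, limits)
```

Changing the simulated growth condition then amounts to editing the medium set or the measured rates, leaving the rest of the model untouched.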

The results showed that a high degree of redundancy exists in central intermediary metabolism during growth in glucose minimal medium, which is related to the interconnectivity of the metabolic reactions. Only a few metabolic functions were found to be essential such that their loss removes the capability of cellular growth on glucose. For growth on glucose, the essential gene products are involved in the 3-carbon stage of glycolysis, three reactions of the TCA cycle, and several points within the pentose phosphate pathway (PPP). Deletions in the 6-carbon stage of glycolysis result in a reduced ability to support growth due to the diversion of additional flux through the PPP.

The results from the gene deletion study can be directly compared with growth data from mutants. The growth characteristics of a series of E. coli mutants on several different carbon sources were examined (80 cases were determined from the literature), and compared to the in silico deletion results (Table 2). The majority (73 of 80 cases, or 91%) of the mutant experimental observations are consistent with the predictions of the in silico study. The results from the in silico gene deletion analysis are thus consistent with experimental observations.
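The comparison described above reduces to counting the cases in which the predicted and observed growth calls agree. A minimal sketch follows; the function name agreement and the three example cases are hypothetical and are not entries of Table 2. Only the final line uses figures from the text.

```python
def agreement(predicted, observed):
    """Count cases where predicted and observed growth calls match."""
    shared = sorted(set(predicted) & set(observed))
    matches = sum(predicted[case] == observed[case] for case in shared)
    return matches, len(shared)


# Hypothetical cases keyed by (mutant, carbon source); True means growth.
predicted = {
    ("mutant_a", "glucose"): True,
    ("mutant_b", "glucose"): False,
    ("mutant_c", "acetate"): True,
}
observed = {
    ("mutant_a", "glucose"): True,
    ("mutant_b", "glucose"): False,
    ("mutant_c", "acetate"): False,
}

matches, total = agreement(predicted, observed)
print(f"toy example: {matches} of {total} cases agree")

# Figures reported in the text: 73 consistent cases out of 80.
print(f"reported: {73 / 80:.2%}")
```

The reported ratio evaluates to 91.25%, which the text rounds to 91%.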

C. Example 3: Prediction of Genome Scale Shifts in Gene Expression

Flux based analysis can be used to predict metabolic phenotypes under different growth conditions, such as substrate and oxygen availability. The relation between the flux value and the gene expression levels is non-linear, resulting in bifurcations and multiple steady states. However, FBA can give qualitative (on/off) information as well as the relative importance of gene products under a given condition. Based on the magnitude of the metabolic fluxes, qualitative assessment of gene expression can be inferred.

FIG. 3a shows the five phases of distinct metabolic behavior of E. coli in response to varying oxygen availability, going from completely anaerobic (phase I) to completely aerobic (phase V). FIGS. 3b and 3c display lists of the genes that are predicted to be induced or repressed upon the shift from aerobic growth (phase V) to nearly complete anaerobic growth (phase II). The numerical values shown in FIGS. 3b and 3c are the fold change in the magnitude of the fluxes calculated for each of the listed enzymes.
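A fold change in flux magnitude between two conditions, together with the qualitative on/off calls mentioned earlier, might be computed as in the following sketch. The function name fold_changes, the reaction labels, the flux values, and the tolerance are hypothetical, and the ratio is assumed to be taken as shifted condition over reference condition; none of these values come from FIGS. 3b and 3c.

```python
def fold_changes(reference, shifted, tol=1e-9):
    """Compare flux magnitudes between a reference and a shifted condition.

    Returns a ratio |shifted| / |reference| where both fluxes are active,
    and a qualitative label where a flux is zero in one or both conditions.
    """
    result = {}
    for reaction in sorted(set(reference) | set(shifted)):
        before = abs(reference.get(reaction, 0.0))
        after = abs(shifted.get(reaction, 0.0))
        if before <= tol and after <= tol:
            result[reaction] = "inactive"
        elif before <= tol:
            result[reaction] = "switched on"
        elif after <= tol:
            result[reaction] = "switched off"
        else:
            result[reaction] = after / before
    return result


# Hypothetical flux values for an aerobic reference and an oxygen-limited shift.
aerobic = {"rxn_a": 2.0, "rxn_b": 0.0, "rxn_c": 4.0, "rxn_d": 0.0}
oxygen_limited = {"rxn_a": 6.0, "rxn_b": 1.5, "rxn_c": 0.0, "rxn_d": 0.0}

changes = fold_changes(aerobic, oxygen_limited)
for reaction, change in changes.items():
    print(reaction, change)
```

Ratios above one would correspond to predicted induction and ratios below one to predicted repression, with the on/off labels covering fluxes that are absent in one condition.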

For this example, the objective of maximization of biomass yield is utilized (as described above). The constraints on the system are also set accordingly (as described above). However, in this example, a change in the availability of a key substrate leads to changes in the metabolic behavior. The change in the parameter is reflected as a change in the uptake flux. Therefore, the maximal allowable oxygen uptake rate is changed to generate these data. The figure demonstrates how several fluxes in the metabolic network will change as the oxygen uptake flux is continuously decreased. The constraints on the fluxes are therefore identical to those described in the previous section, except that the oxygen uptake rate is set to coincide with the corresponding point in the diagram.

Corresponding experimental data sets are now becoming available. Using high-density oligonucleotide arrays, the expression levels of nearly every gene in Saccharomyces cerevisiae can now be analyzed under various growth conditions. From these studies it was shown that nearly 90% of all yeast mRNAs are present in growth on rich and minimal media, while a large number of mRNAs were shown to be differentially expressed under these two conditions. Another recent article shows how the metabolic and genetic control of gene expression can be studied on a genomic scale using DNA microarray technology (Exploring the Metabolic and Genetic Control of Gene Expression on a Genomic Scale, Science, Vol. 278, Oct. 24, 1997). The temporal changes in genetic expression profiles that occur during the diauxic shift in S. cerevisiae were observed for every known expressed sequence tag (EST) in this genome. As shown above, FBA can be used to qualitatively simulate shifts in metabolic genotype expression patterns due to alterations in growth environments. Thus, FBA can serve to complement current studies in metabolic gene expression by providing a fundamental approach to analyze, interpret, and predict the data from such experiments.

D. Example 4: Design of Defined Media

An important economic consideration in large-scale bioprocesses is optimal medium formulation. FBA can be used to design such media. Following the approach defined above, a flux-balance model for the first completely sequenced free living organism, Haemophilus influenzae, has been generated. One application of this model is to predict a minimal defined medium. It was found that H. influenzae can grow on the minimal defined medium as determined from the ORF assignments and predicted using FBA. Simulated bacterial growth was predicted using the following defined medium: fructose, arginine, cysteine, glutamate, putrescine, spermidine, thiamin, NAD, tetrapyrrole, pantothenate, ammonia, phosphate. This predicted minimal medium was compared to the previously published defined media and was found to differ in only one compound, inosine. It is known that inosine is not required for growth; however, it does serve to enhance growth. Again the in silico results obtained were consistent with published in vivo research. These results provide confidence in the use of this type of approach for the design of defined media for organisms for which no defined medium currently exists.
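The comparison between a predicted and a published medium is a set difference. In the sketch below the predicted medium is the list given in the text, while the published medium is an assumed stand-in constructed by adding inosine to that list, on the reading that inosine is the single differing compound; the function name compare_media is hypothetical.

```python
def compare_media(predicted, published):
    """List the compounds that appear in only one of the two media."""
    return {
        "only_predicted": sorted(predicted - published),
        "only_published": sorted(published - predicted),
    }


# Predicted minimal medium as listed in the text.
predicted = {
    "fructose", "arginine", "cysteine", "glutamate", "putrescine",
    "spermidine", "thiamin", "NAD", "tetrapyrrole", "pantothenate",
    "ammonia", "phosphate",
}
# Assumption: stand-in for the published medium (predicted list plus inosine).
published = predicted | {"inosine"}

difference = compare_media(predicted, published)
print(difference)
```

A compound that appears only in the published medium, as inosine does here, is a candidate growth enhancer rather than a strict requirement, which matches the observation reported above.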

While particular embodiments of the invention have been described in detail, it will be apparent to those skilled in the art that these embodiments are exemplary rather than limiting, and the true scope of the invention is defined by the claims that follow.

TABLE 1. The genes included in the E. coli metabolic genotype along with corresponding enzymes and reactions that comprise the genome specific stoichiometric matrix. The final column indicates the presence/absence of the gene (as the number of copies) in the E. coli genome. Thus the presence of a gene in the E. coli genome indicates that the gene is part of the metabolic genotype. Reactions/Genes not present in the genome are those gathered at state 56 in FIG. 2 and together with the reactions of the genes in the metabolic genotype form the columns of the genome specific stoichiometric matrix.

Enzyme | Gene | Reaction | E. coli Genome
Glucokinase | glk | GLC + ATP −> G6P + ADP | 1
Glucokinase | glk | bDGLC + ATP −> bDG6P + ADP | 1
Phosphoglucose isomerase | pgi | G6P <−> F6P | 1
Phosphoglucose isomerase | pgi | bDG6P <−> G6P | 1
Phosphoglucose isomerase | pgi | bDG6P <−> F6P | 1
Aldose 1-epimerase | galM | bDGLC <−> GLC | 1
Glucose-1-phosphatase | agp | G1P −> GLC + PI | 1
Phosphofructokinase | pfkA | F6P + ATP −> FDP + ADP | 1
Phosphofructokinase B | pfkB | F6P + ATP −> FDP + ADP | 1
Fructose-1,6-bisphosphatase | fbp | FDP −> F6P + PI | 1
Fructose-1,6-bisphosphate aldolase | fba | FDP <−> T3P1 + T3P2 | 2
Triosephosphate isomerase | tpiA | T3P1 <−> T3P2 | 1
Methylglyoxal synthase | mgsA | T3P2 −> MTHGXL + PI | 0
Glyceraldehyde-3-phosphate dehydrogenase-A complex | gapA | T3P1 + PI + NAD <−> NADH + 13PDG | 1
Glyceraldehyde-3-phosphate dehydrogenase-C complex | gapC1C2 | T3P1 + PI + NAD <−> NADH + 13PDG | 2
Phosphoglycerate kinase | pgk | 13PDG + ADP <−> 3PG + ATP | 1
Phosphoglycerate mutase 1 | gpmA | 3PG <−> 2PG | 1
Phosphoglycerate mutase 2 | gpmB | 3PG <−> 2PG | 1
Enolase | eno | 2PG <−> PEP | 1
Phosphoenolpyruvate synthase | ppsA | PYR + ATP −> PEP + AMP + PI | 1
Pyruvate Kinase II | pykA | PEP + ADP −> PYR + ATP | 1
Pyruvate Kinase I | pykF | PEP + ADP −> PYR + ATP | 1
Pyruvate dehydrogenase | lpdA, aceEF | PYR + COA + NAD −> NADH + CO2 + ACCOA | 3
Glucose-1-phosphate adenylyltransferase | glgC | ATP + G1P −> ADPGLC + PPI | 1
Glycogen synthase | glgA | ADPGLC −> ADP + GLYCOGEN | 1
Glycogen phosphorylase | glgP | GLYCOGEN + PI −> G1P | 1
Maltodextrin phosphorylase | malP | GLYCOGEN + PI −> G1P | 1
Glucose 6-phosphate-1-dehydrogenase | zwf | G6P + NADP <−> D6PGL + NADPH | 1
6-Phosphogluconolactonase | pgl | D6PGL −> D6PGC | 0
6-Phosphogluconate dehydrogenase (decarboxylating) | gnd | D6PGC + NADP −> NADPH + CO2 + RL5P | 1
Ribose-5-phosphate isomerase A | rpiA | RL5P <−> R5P | 1
Ribose-5-phosphate isomerase B | rpiB | RL5P <−> R5P | 1
Ribulose phosphate 3-epimerase | rpe | RL5P <−> X5P | 1
Transketolase I | tktA | R5P + X5P <−> T3P1 + S7P | 1
Transketolase II | tktB | R5P + X5P <−> T3P1 + S7P | 1
Transketolase I | tktA | X5P + E4P <−> F6P + T3P1 | 1
Transketolase II | tktB | X5P + E4P <−> F6P + T3P1 | 1
Transaldolase B | talB | T3P1 + S7P <−> E4P + F6P | 1
Phosphogluconate dehydratase | edd | D6PGC −> 2KD6PG | 1
2-Keto-3-deoxy-6-phosphogluconate aldolase | eda | 2KD6PG −> T3P1 + PYR | 1
Citrate synthase | gltA | ACCOA + OA −> COA + CIT | 1
Aconitase A | acnA | CIT <−> ICIT | 1
Aconitase B | acnB | CIT <−> ICIT | 1
Isocitrate dehydrogenase | icdA | ICIT + NADP <−> CO2 + NADPH + AKG | 1
2-Ketoglutarate dehydrogenase | sucAB, lpdA | AKG + NAD + COA −> CO2 + NADH + SUCCOA | 3
Succinyl-CoA synthetase | sucCD | SUCCOA + ADP + PI <−> ATP + COA + SUCC | 2
Succinate dehydrogenase | sdhABCD | SUCC + FAD −> FADH + FUM | 4
Fumarate reductase | frdABCD | FUM + FADH −> SUCC + FAD | 4
Fumarase A | fumA | FUM <−> MAL | 1
Fumarase B | fumB | FUM <−> MAL | 1
Fumarase C | fumC | FUM <−> MAL | 1
Malate dehydrogenase | mdh | MAL + NAD <−> NADH + OA | 1
D-Lactate dehydrogenase 1 | dld | PYR + NADH <−> NAD + LAC | 1
D-Lactate dehydrogenase 2 | ldhA | PYR + NADH <−> NAD + LAC | 1
Acetaldehyde dehydrogenase | adhE | ACCOA + 2 NADH <−> ETH + 2 NAD + COA | 1
Pyruvate formate lyase 1 | pflAB | PYR + COA −> ACCOA + FOR | 2
Pyruvate formate lyase 2 | pflCD | PYR + COA −> ACCOA + FOR | 2
Formate hydrogen lyase | fdhF, hycBEFG | FOR −> CO2 | 5
Phosphotransacetylase | pta | ACCOA + PI <−> ACTP + COA | 1
Acetate kinase A | ackA | ACTP + ADP <−> ATP + AC | 1
GAR transformylase T | purT | ACTP + ADP <−> ATP + AC | 1
Acetyl-CoA synthetase | acs | ATP + AC + COA −> AMP + PPI + ACCOA | 1
Phosphoenolpyruvate carboxykinase | pckA | OA + ATP −> PEP + CO2 + ADP | 1
Phosphoenolpyruvate carboxylase | ppc | PEP + CO2 −> OA + PI | 1
Malic enzyme (NADP) | maeB | MAL + NADP −> CO2 + NADPH + PYR | 0
Malic enzyme (NAD) | sfcA | MAL + NAD −> CO2 + NADH + PYR | 1
Isocitrate lyase | aceA | ICIT −> GLX + SUCC | 1
Malate synthase A | aceB | ACCOA + GLX −> COA + MAL | 1
Malate synthase G | glcB | ACCOA + GLX −> COA + MAL | 1
Inorganic pyrophosphatase | ppa | PPI −> 2 PI | 1
NADH dehydrogenase II | ndh | NADH + Q −> NAD + QH2 | 1
NADH dehydrogenase I | nuoABEFGHIJ | NADH + Q −> NAD + QH2 + 3.5 HEXT | 1
Formate dehydrogenase-N | fdnGHI | FOR + Q −> QH2 + CO2 + 2 HEXT | 3
Formate dehydrogenase-O | fdoIHG | FOR + Q −> QH2 + CO2 + 2 HEXT | 3
Formate dehydrogenase | fdhF | FOR + Q −> QH2 + CO2 + 2 HEXT | 1
Pyruvate oxidase | poxB | PYR + Q −> AC + CO2 + QH2 | 1
Glycerol-3-phosphate dehydrogenase (aerobic) | glpD | GL3P + Q −> T3P2 + QH2 | 1
Glycerol-3-phosphate dehydrogenase (anaerobic) | glpABC | GL3P + Q −> T3P2 + QH2 | 3
Cytochrome oxidase bo3 | cyoABCD, cyc | QH2 + .5 O2 −> Q + 2.5 HEXT | 6
Cytochrome oxidase bd | cydABCD, app | QH2 + .5 O2 −> Q + 2 HEXT | 6
Succinate dehydrogenase complex | sdhABCD | FADH + Q <−> FAD + QH2 | 4
Thioredoxin reductase | trxB | OTHIO + NADPH −> NADP + RTHIO | 1
Pyridine nucleotide transhydrogenase | pntAB | NADPH + NAD −> NADP + NADH | 2
Pyridine nucleotide transhydrogenase | pntAB | NADP + NADH + 2 HEXT −> NADPH + NAD | 2
Hydrogenase 1 | hyaABC | 2 Q + 2 HEXT <−> 2 QH2 + H2 | 3
Hydrogenase 2 | hybAC | 2 Q + 2 HEXT <−> 2 QH2 + H2 | 2
Hydrogenase 3 | hycFGBE | 2 Q + 2 HEXT <−> 2 QH2 + H2 | 4
F0F1-ATPase | atpABCDEFG | ATP <−> ADP + PI + 4 HEXT | 9
Alpha-galactosidase (melibiase) | melA | MELI −> GLC + GLAC | 1
Galactokinase | galK | GLAC + ATP −> GAL1P + ADP | 1
Galactose-1-phosphate uridylyltransferase | galT | GAL1P + UDPG <−> G1P + UDPGAL | 1
UDP-glucose 4-epimerase | galE | UDPGAL <−> UDPG | 1
UDP-glucose-1-phosphate uridylyltransferase | galU | G1P + UTP <−> UDPG + PPI | 1
Phosphoglucomutase | pgm | G1P <−> G6P | 1
Periplasmic beta-glucosidase precursor | bglX | LCTS −> GLC + GLAC | 1
Beta-galactosidase (LACTase) | lacZ | LCTS −> GLC + GLAC | 1
trehalose-6-phosphate hydrolase | treC | TRE6P −> bDG6P + GLC | 1
Beta-fructofuranosidase | | SUC6P −> G6P + FRU | 0
1-Phosphofructokinase (Fructose 1-phosphate kinase) | fruK | F1P + ATP −> FDP + ADP | 1
Xylose isomerase | xylA | FRU −> GLC | 1
Phosphomannomutase | cpsG | MAN6P <−> MAN1P | 1
Mannose-6-phosphate isomerase | manA | MACN1P <−> F6P | 1
N-Acetylglucosamine-6-phosphate deacetylase | nagA | NAGP −> GA6P + AC | 1
Glucosamine-6-phosphate deaminase | nagB | GA6P −> F6P + NH3 | 1
N-Acetylneuraminate lyase | nanA | SLA −> PYR + NAMAN | 1
L-Fucose isomerase | fucI | FUC <−> FCL | 1
L-Fuculokinase | fucK | FCL + ATP −> FCL1P + ADP | 1
L-Fuculose phosphate aldolase | fucA | FCL1P <−> LACAL + T3P2 | 1
Lactaldehyde reductase | fucO | LACAL + NADH <−> 12PPD + NAD | 1
Aldehyde dehydrogenase A | aldA | LACAL + NAD <−> LLAC + NADH | 1
Aldehyde dehydrogenase B | aldB | LACAL + NAD <−> LLAC + NADH | 1
Aldehyde dehydrogenase | adhC | LACAL + NAD <−> LLAC + NADH | 1
Aldehyde dehydrogenase | adhC | GLAL + NADH <−> GL + NADH | 1
Aldehyde dehydrogenase | adhE | LACAL + NAD −> LLAC + NADH | 1
Aldehyde dehydrogenase | aldH | LACAL + NAD <−> LLAC + NADH | 1
Aldehyde dehydrogenase | aldH | ACAL + NAD −> AC + NADH | 1
Gluconokinase I | gntV | GLCN + ATP −> D6PGC + ADP | 1
Gluconokinase II | gntK | GLCN + ATP −> D6PGC + ADP | 1
L-Rhamnose isomerase | rhaA | RMN <−> RML | 1
Rhamnulokinase | rhaB | RML + ATP −> RML1P + ADP | 1
Rhamnulose-1-phosphate aldolase | rhaD | RML1P <−> LACAL + T3P2 | 1
L-Arabinose isomerase | araA | ARAB <−> RBL | 1
Arabinose-5-phosphate isomerase | | RL5P <−> A5P | 0
L-Ribulokinase | araB | RBL + ATP −> RL5P + ADP | 1
L-Ribulose-phosphate 4-epimerase | araD | RL5P <−> X5P | 1
Xylose isomerase | xylA | XYL <−> XUL | 1
Xylulokinase | xylB | XUL + ATP −> X5P + ADP | 1
Ribokinase | rbsK | RIB + ATP −> R5P + ADP | 1
Mannitol-1-phosphate 5-dehydrogenase | mtlD | MNT6P + NAD <−> F6P + NADH | 1
Glucitol-6-phosphate dehydrogenase | srlD | GLT6P + NAD <−> F6P + NADH | 1
Galactitol-1-phosphate dehydrogenase | gatD | GLTL1P + NAD <−> TAG6P + NADH | 1
Phosphofructokinase B | pfkB | TAG6P + ATP −> TAG16P + ADP | 1
1-Phosphofructokinase | fruK | TAG6P + ATP −> TAG16P + ADP | 1
Tagatose-6-phosphate kinase | agaZ | TAG6P + ATP −> TAG16P + ADP | 1
Tagatose-bisphosphate aldolase 2 | gatY | TAG16P <−> T3P2 + T3P1 | 1
Tagatose-bisphosphate aldolase 1 | agaY | TAG16P <−> T3P2 + T3P1 | 1
Glycerol kinase | glpK | GL + ATP −> GL3P + ADP | 1
Glycerol-3-phosphate-dehydrogenase-[NAD(P)+] | gpsA | GL3P + NADP <−> T3P2 + NADPH | 1
Phosphopentomutase | deoB | DR1P <−> DR5P | 1
Phosphopentomutase | deoB | R1P <−> R5P | 1
Deoxyribose-phosphate aldolase | deoC | DR5P −> ACAL + T3P1 | 1
Aspartate transaminase | aspC | OA + GLU <−> ASP + AKG | 1
Asparagine synthetase (Glutamate dependent) | asnB | ASP + ATP + GLN −> GLU + ASN + AMP + PPI | 1
Aspartate-ammonia ligase | asnA | ASP + ATP + NH3 −> ASN + AMP + PPI | 1
Glutamate dehydrogenase | gdhA | AKG + NH3 + NADPH <−> GLU + NADP | 1
Glutamate-ammonia ligase | glnA | GLU + NH3 + ATP −> GLN + ADP + PI | 1
Glutamate synthase | gltBD | AKG + GLN + NADPH −> NADP + 2 GLU | 2
Alanine transaminase | alaB | PYR + GLU <−> AKG + ALA | 0
Valine-pyruvate aminotransferase | avtA | OIVAL + ALA −> PYR + VAL | 1
Alanine racemase, biosynthetic | alr | ALA <−> DALA | 1
Alanine racemase, catabolic | dadX | ALA −> DALA | 1
N-Acetylglutamate synthase | argA | GLU + ACCOA −> COA + NAGLU | 1
N-Acetylglutamate kinase | argB | NAGLU + ATP −> ADP + NAGLUYP | 1
N-Acetylglutamate phosphate reductase | argC | NAGLUYP + NADPH <−> NADP + PI + NAGLUSAL | 1
Acetylornithine transaminase | argD | NAGLUSAL + GLU <−> AKG + NAARON | 1
Acetylornithine deacetylase | argE | NAARON −> AC + ORN | 1
Carbamoyl phosphate synthetase | carAB | GLN + 2 ATP + CO2 −> GLU + CAP + 2 ADP + PI | 2
Ornithine carbamoyl transferase 1 | argF | ORN + CAP <−> CITR + PI | 2
Ornithine carbamoyl transferase 2 | argI | ORN + CAP <−> CITR + PI | 1
Ornithine transaminase | ygjGH | ORN + AKG −> GLUGSAL + GLU | 2
Argininosuccinate synthase | argG | CITR + ASP + ATP −> AMP + PPI + ARGSUCC | 1
Argininosuccinate lyase | argH | ARGSUCC <−> FUM + ARG | 1
Arginine decarboxylase, biosynthetic | speA | ARG −> CO2 + AGM | 1
Arginine decarboxylase, degradative | adi | ARG −> CO2 + AGM | 1
Agmatinase | speB | AGM −> UREA + PTRC | 1
Ornithine decarboxylase, biosynthetic | speC | ORN −> PTRC + CO2 | 1
Ornithine decarboxylase, degradative | speF | ORN −> PTRC + CO2 | 1
Adenosylmethionine decarboxylase | speD | SAM <−> DSAM + CO2 | 1
Spermidine synthase | speE | PTRC + DSAM −> SPMD + 5MTA | 1
Methylthioadenosine nucleosidase | | 5MTA −> AD + 5MTR | 0
5-Methylthioribose kinase | | 5MTR + ATP −> 5MTRP + ADP | 0
5-Methylthioribose-1-phosphate isomerase | | 5MTRP <−> 5MTR1P | 0
E-1 (Enolase-phosphatase) | | 5MTR1P −> DKMPP | 0
E-3 (Unknown) | | DKMPP −> FOR + KMB | 0
Transamination (Unknown) | | KMB + GLN −> GLU + MET | 0
γ-Glutamyl kinase | proB | GLU + ATP −> ADP + GLUP | 1
Glutamate-5-semialdehyde dehydrogenase | proA | GLUP + NADPH −> NADP + PI + GLUGSAL | 1
N-Acetylornithine deacetylase | argE | NAGLUSAL −> GLUGSAL + AC | 1
Pyrroline-5-carboxylate reductase | proC | GLUGSAL + NADPH −> PRO + NADP | 1
Threonine dehydratase, biosynthetic | ilvA | THR −> NH3 + OBUT | 1
Threonine dehydratase, catabolic | tdcB | THR −> NH3 + OBUT | 1
Acetohydroxybutanoate synthase I | ilvBN | OBUT + PYR −> ABUT + CO2 | 2
Acetohydroxybutanoate synthase II | ilvG(12)M | OBUT + PYR −> ABUT + CO2 | 3
Acetohydroxybutanoate synthase III | ilvIH | OBUT + PYR −> ABUT + CO2 | 2
Acetohydroxy Acid isomeroreductase | ilvC | ABUT + NADPH −> NADP + DHMVA | 1
Dihydroxy acid dehydratase | ilvD | DHMVA −> OMVAL | 1
Branched chain amino acid aminotransferase | ilvE | OMVAL + GLU <−> AKG + ILE | 1
Acetolactate synthase I | ilvBN | 2 PYR −> CO2 + ACLAC | 2
Acetolactate synthase II | ilvG(12)M | 2 PYR −> CO2 + ACLAC | 3
Acetolactate synthase III | ilvIH | 2 PYR −> CO2 + ACLAC | 2
Acetohydroxy acid isomeroreductase | ilvC | ACLAC + NADPH −> NADP + DHVAL | 1
Dihydroxy acid dehydratase | ilvD | DHVAL −> OIVAL | 1
Branched chain amino acid aminotransferase | ilvE | OIVAL + GLU −> AKG + VAL | 1
Valine-pyruvate aminotransferase | avtA | OIVAL + ALA −> PYR + VAL | 1
Isopropylmalate synthase | leuA | ACCOA + OIVAL −> COA + CBHCAP | 1
Isopropylmalate isomerase | leuCD | CBHCAP <−> IPPMAL | 2
3-Isopropylmalate dehydrogenase | leuB | IPPMAL + NAD −> NADH + OICAP + CO2 | 1
Branched chain amino acid aminotransferase | ilvE | OICAP + GLU −> AKG + LEU | 1
Aromatic amino acid transaminase | tyrB | OICAP + GLU −> AKG + LEU | 1
2-Dehydro-3-deoxyphosphoheptonate aldolase F | aroF | E4P + PEP −> PI + 3DDAH7P | 1
2-Dehydro-3-deoxyphosphoheptonate aldolase G | aroG | E4P + PEP −> PI + 3DDAH7P | 1
2-Dehydro-3-deoxyphosphoheptonate aldolase H | aroH | E4P + PEP −> PI + 3DDAH7P | 1
3-Dehydroquinate synthase | aroB | 3DDAH7P −> DQT + PI | 1
3-Dehydroquinate dehydratase | aroD | DQT <−> DHSK | 1
Shikimate dehydrogenase | aroE | DHSK + NADPH <−> SME + NADP | 1
Shikimate kinase I | aroK | SME + ATP −> ADP + SME5P | 1
Shikimate kinase II | aroL | SME + ATP −> ADP + SME5P | 1
3-Phosphoshikimate-1-carboxyvinyltransferase | aroA | SME5P + PEP <−> 3PSME + PI | 1
Chorismate synthase | aroC | 3PSME −> PI + CHOR | 1
Chorismate mutase 1 | pheA | CHOR −> PHEN | 1
Prephenate dehydratase | pheA | PHEN −> CO2 + PHPYR | 1
Aromatic amino acid transaminase | tyrB | PHPYR + GLU <−> AKG + PHE | 1
Chorismate mutase 2 | tyrA | CHOR −> PHEN | 1
Prephenate dehydrogenase | tyrA | PHEN + NAD −> HPHPYR + CO2 + NADH | 1
Aromatic amino acid transaminase | tyrB | HPHPYR + GLU <−> AKG + TYR | 1
Aspartate transaminase | aspC | HPHPYR + GLU <−> AKG + TYR | 1
Anthranilate synthase | trpDE | CHOR + GLN −> GLU + PYR + AN | 2
Anthranilate synthase component II | trpD | AN + PRPP −> PPI + NPRAN | 1
Phosphoribosyl anthranilate isomerase | trpC | NPRAN −> CPAD5P | 1
Indoleglycerol phosphate synthase | trpC | CPAD5P −> CO2 + IGP | 1
Tryptophan synthase | trpAB | IGP + SER −> T3P1 + TRP | 2
Phosphoribosyl pyrophosphate synthase | prsA | R5P + ATP <−> PRPP + AMP | 1
ATP phosphoribosyltransferase | hisG | PRPP + ATP −> PPI + PRBATP | 1
Phosphoribosyl-ATP pyrophosphatase | hisIE | PRBATP −> PPI + PRBAMP | 1
Phosphoribosyl-AMP cyclohydrolase | hisIE | PRBAMP −> PRFP | 1
Phosphoribosylformimino-5-amino-1-phosphoribosyl-4-imidazole c | hisA | PRFP −> PRLP | 1
Imidazoleglycerol phosphate synthase | hisFH | PRLP + GLN −> GLU + AICAR + DIMGP | 2
Imidazoleglycerol phosphate dehydratase | hisB | DIMGP −> IMACP | 1
L-Histidinol phosphate aminotransferase | hisC | IMACP + GLU −> AKG + HISOLP | 1
Histidinol phosphatase | hisB | HISOLP −> PI + HISOL | 1
Histidinol dehydrogenase | hisD | HISOL + 3 NAD −> HIS + 3 NADH | 1
3-Phosphoglycerate dehydrogenase | serA | 3PG + NAD −> NADH + PHP | 1
Phosphoserine transaminase | serC | PHP + GLU −> AKG + 3PSER | 1
Phosphoserine phosphatase | serB | 3PSER −> PI + SER | 1
Glycine hydroxymethyltransferase | glyA | THF + SER −> GLY + METTHF | 1
Threonine dehydrogenase | tdh | THR + COA −> GLY + ACCOA | 1
Amino ketobutyrate CoA ligase | kbl | THR + COA −> GLY + ACCOA | 1
Sulfate adenylyltransferase | cysDN | SLF + ATP + GTP −> PPI + APS + GDP + PI | 2
Adenylylsulfate kinase | cysC | APS + ATP −> ADP + PAPS | 1
3′-Phospho-adenylylsulfate reductase | cysH | PAPS + RTHIO −> OTHIO + H2SO3 + PAP | 1
Sulfite reductase | cysIJ | H2SO3 + 3 NADPH <−> H2S + 3 NADP | 2
Serine transacetylase | cysE | SER + ACCOA <−> COA + ASER | 1
O-Acetylserine (thiol)-lyase A | cysK | ASER + H2S −> AC + CYS | 1
O-Acetylserine (thiol)-lyase B | cysM | ASER + H2S −> AC + CYS | 1
3′-5′ Bisphosphate nucleotidase | | PAP −> AMP + PI | 0
Aspartate kinase I | thrA | ASP + ATP <−> ADP + BASP | 1
Aspartate kinase II | metL | ASP + ATP <−> ADP + BASP | 1
Aspartate kinase III | lysC | ASP + ATP <−> ADP + BASP | 1
Aspartate semialdehyde dehydrogenase | asd | BASP + NADPH <−> NADP + PI + ASPSA | 1
Homoserine dehydrogenase I | thrA | ASPSA + NADPH <−> NADP + HSER | 1
Homoserine dehydrogenase II | metL | ASPSA + NADPH <−> NADP + HSER | 1
Homoserine kinase | thrB | HSER + ATP −> ADP + PHSER | 1
Threonine synthase | thrC | PHSER −> PI + THR | 1
Dihydrodipicolinate synthase | dapA | ASPSA + PYR −> D23PIC | 1
Dihydrodipicolinate reductase | dapB | D23PIC + NADPH −> NADP + PIP26DX | 1
Tetrahydrodipicolinate succinylase | dapD | PIP26DX + SUCCOA −> COA + NS2A6O | 1
Succinyl diaminopimelate aminotransferase | dapC | NS2A6O + GLU <−> AKG + NS26DP | 0
Succinyl diaminopimelate desuccinylase | dapE | NS26DP −> SUCC + D26PIM | 1
Diaminopimelate epimerase | dapF | D26PIM <−> MDAP | 1
Diaminopimelate decarboxylase | lysA | MDAP −> CO2 + LYS | 1
Lysine decarboxylase 1 | cadA | LYS −> CO2 + CADV | 1
Lysine decarboxylase 2 | ldcC | LYS −> CO2 + CADV | 1
Homoserine transsuccinylase | metA | HSER + SUCCOA −> COA + OSLHSER | 1
O-succinylhomoserine lyase | metB | OSLHSER + CYS −> SUCC + LLCT | 1
Cystathionine-β-lyase | metC | LLCT −> HCYS + PYR + NH3 | 1
Adenosyl homocysteinase (Unknown) | Unknown | HCYS + ADN <−> SAH | 0
Cobalamin-dependent methionine synthase | metH | HCYS + MTHF −> MET + THF | 1
Cobalamin-independent methionine synthase | metE | HCYS + MTHF −> MET + THF | 1
S-Adenosylmethionine synthetase | metK | MET + ATP −> PPI + PI + SAM | 1
D-Amino acid dehydrogenase | dadA | DALA + FAD −> FADH + PYR + NH3 | 1
Putrescine transaminase | pat | PTRC + AKG −> GABAL + GLU | 0
Amino oxidase | tynA | PTRC −> GABAL + NH3 | 1
Aminobutyraldehyde dehydrogenase | prr | GABAL + NAD −> GABA + NADH | 0
Aldehyde dehydrogenase | aldH | GABAL + NAD −> GABA + NADH | 1
Aminobutyrate aminotransaminase 1 | gabT | GABA + AKG −> SUCCSAL + GLU | 1
Aminobutyrate aminotransaminase 2 | goaG | GABA + AKG −> SUCCSAL + GLU | 1
Succinate semialdehyde dehydrogenase-NAD | sad | SUCCSAL + NAD −> SUCC + NADH | 0
Succinate semialdehyde dehydrogenase-NADP | gabD | SUCCSAL + NADP −> SUCC + NADPH | 1
Asparaginase I | ansA | ASN −> ASP + NH3 | 1
Asparaginase II | ansB | ASN −> ASP + NH3 | 1
Aspartate ammonia-lyase | aspA | ASP −> FUM + NH3 | 1
Tryptophanase | tnaA | CYS −> PYR + NH3 + H2S | 1
L-Cysteine desulfhydrase | | CYS −> PYR + NH3 + H2S | 0
Glutamate decarboxylase A | gadA | GLU −> GABA + CO2 | 1
Glutamate decarboxylase B | gadB | GLU −> GABA + CO2 | 1
Glutaminase A | | GLN −> GLU + NH3 | 0
Glutaminase B | | GLN −> GLU + NH3 | 0
Proline dehydrogenase | putA | PRO + FAD −> FADH + GLUGSAL | 1
Pyrroline-5-carboxylate dehydrogenase | putA | GLUGSAL + NAD −> NADH + GLU | 1
Serine deaminase 1 | sdaA | SER −> PYR + NH3 | 1
Serine deaminase 2 | sdaB | SER −> PYR + NH3 | 1
Tryptophanase | tnaA | SER −> PYR + NH3 | 1
D-Serine deaminase | dsdA | DSER −> PYR + NH3 | 1
Threonine dehydrogenase | tdh | THR + NAD −> 2A3O + NADH | 1
Amino ketobutyrate ligase | kbl | 2A3O + COA −> ACCOA + GLY | 1
Threonine dehydratase catabolic | tdcB | THR −> OBUT + NH3 | 1
Threonine deaminase 1 | sdaA | THR −> OBUT + NH3 | 1
Threonine deaminase 2 | sdaB | THR −> OBUT + NH3 | 1
Tryptophanase | tnaA | TRP <−> INDOLE + PYR + NH3 | 1
Amidophosphoribosyl transferase | purF | PRPP + GLN −> PPI + GLU + PRAM | 1
Phosphoribosylamine-glycine ligase | purD | PRAM + ATP + GLY <−> ADP + PI + GAR | 1
Phosphoribosylglycinamide formyltransferase | purN | GAR + FTHF −> THF + FGAR | 1
GAR transformylase T | purT | GAR + FOR + ATP −> ADP + PI + FGAR | 1
Phosphoribosylformylglycinamide synthetase | purL | FGAR + ATP + GLN −> GLU + ADP + PI + FGAM | 1
Phosphoribosylformylglycinamide cyclo-ligase | purM | FGAM + ATP −> ADP + PI + AIR | 1
Phosphoribosylaminoimidazole carboxylase 1 | purK | AIR + CO2 + ATP <−> NCAIR + ADP + PI | 1
Phosphoribosylaminoimidazole carboxylase 2 | purE | NCAIR <−> CAIR | 1
Phosphoribosylaminoimidazole-succinocarboxamide synthetase | purC | CAIR + ATP + ASP <−> ADP + PI + SAICAR | 1
5′-Phosphoribosyl-4-(N-succinocarboxamide)-5-aminoimidazole lya | purB | SAICAR <−> FUM + AICAR | 1
AICAR transformylase | purH | AICAR + FTHF <−> THF + PRFICA | 1
IMP cyclohydrolase | purH | PRFICA <−> IMP | 1
Adenylosuccinate synthetase | purA | IMP + GTP + ASP −> GDP + PI + ASUC | 1
Adenylosuccinate lyase | purB | ASUC <−> FUM + AMP | 1
IMP dehydrogenase | guaB | IMP + NAD −> NADH + XMP | 1
GMP synthase | guaA | XMP + ATP + GLN −> GLU + AMP + PPI + GMP | 1
GMP reductase | guaC | GMP + NADPH −> NADP + IMY + NH3 | 1
Aspartate-carbamoyltransferase | pyrBI | CAP + ASP −> CAASP + PI | 2
Dihydroorotase | pyrC | CAASP <−> DOROA | 1
Dihydroorotate dehydrogenase | pyrD | DOROA + Q <−> QH2 + OROA | 1
Orotate phosphoribosyl transferase | pyrE | OROA + PRPP <−> PPI + OMP | 1
OMP decarboxylase | pyrF | OMP −> CO2 + UMP | 1
CTP synthetase | pyrG | UTP + GLN + ATP −> GLU + CTP + ADP + PI | 1
Adenylate kinase | adk | ATP + AMP <−> 2 ADP | 1
Adenylate kinase | adk | GTP + AMP <−> ADP + GDP | 1
Adenylate kinase | adk | ITP + AMP <−> ADP + IDP | 1
Adenylate kinase | adk | DAMP + ATP <−> ADP + DADP | 1
Guanylate kinase | gmk | GMP + ATP <−> GDP + ADP | 1
Deoxyguanylate kinase | gmk | DGMP + ATP <−> DGDP + ADP | 1
Nucleoside-diphosphate kinase | ndk | GDP + ATP <−> GTP + ADP | 1
Nucleoside-diphosphate kinase | ndk | UDP + ATP <−> UTP + ADP | 1
Nucleoside-diphosphate kinase | ndk | CDP + ATP <−> CTP + ADP | 1
Nucleoside-diphosphate kinase | ndk | DGDP + ATP <−> DGTP + ADP | 1
Nucleoside-diphosphate kinase | ndk | DUDP + ATP <−> DUTP + ADP | 1
Nucleoside-diphosphate kinase | ndk | DCDP + ATP <−> DCTP + ADP | 1
Nucleoside-diphosphate kinase | ndk | DADP + ATP <−> DATP + ADP | 1
Nucleoside-diphosphate kinase | ndk | DTDP + ATP <−> DTTP + ADP | 1
AMP nucleosidase | amn | AMP −> AD + R5P | 1
Adenosine deaminase | add | ADN −> INS + NH3 | 1
Deoxyadenosine deaminase | add | DA −> DIN + NH3 | 1
Adenine deaminase | yicP | AD −> NH3 + HYXN | 1
Inosine kinase | gsk | INS + ATP −> IMP + ADP | 1
Guanosine kinase | gsk | GSN + ATP −> GMP + ADP | 1
Adenosine kinase | adk | ADN + ATP −> AMP + ADP | 1
Adenine phosphoryltransferase | apt | AD + PRPP −> PPI + AMP | 1
Xanthine-guanine phosphoribosyltransferase | gpt | XAN + PRPP −> XMP + PPI | 1
Xanthine-guanine phosphoribosyltransferase | gpt | HYXN + PRPP −> PPI + IMP | 1
Hypoxanthine phosphoribosyltransferase | hpt | HYXN + PRPP −> PPI + IMP | 1
Xanthine-guanine phosphoribosyltransferase | gpt | GN + PRPP −> PPI + GMP | 1
Hypoxanthine phosphoribosyltransferase | hpt | GN + PRPP −> PPI + GMP | 1
Xanthosine phosphorylase | xapA | DIN + PI <−> HYXN + DR1P | 1
Purine nucleotide phosphorylase | deoD | DIN + PI <−> HYXN + DR1P | 1
Xanthosine phosphorylase | xapA | DA + PI <−> AD + DR1P | 1
Purine nucleotide phosphorylase | deoD | DA + PI <−> AD + DR1P | 1
Xanthosine phosphorylase | xapA | DG + PI <−> GN + DR1P | 1
Purine nucleotide phosphorylase | deoD | DG + PI <−> GN + DR1P | 1
Xanthosine phosphorylase | xapA | HYXN + R1P <−> INS + PI | 1
Purine nucleotide phosphorylase | deoD | HYXN + R1P <−> INS + PI | 1
Xanthosine phosphorylase | xapA | AD + R1P <−> PI + ADN | 1
Purine nucleotide phosphorylase | deoD | AD + R1P <−> PI + ADN | 1
Xanthosine phosphorylase | xapA | GN + R1P <−> PI + GSN | 1
Purine nucleotide phosphorylase | deoD | GN + R1P <−> PI + GSN | 1
Xanthosine phosphorylase | xapA | XAN + R1P <−> PI + XTSN | 1
Purine nucleotide phosphorylase | deoD | XAN + R1P <−> PI + XTSN | 1
Uridine phosphorylase | udp | URI + PI <−> URA + R1P | 1
Thymidine (deoxyuridine) phosphorylase | deoA | DU + PI <−> URA + DR1P | 1
Purine nucleotide phosphorylase | deoD | DU + PI <−> URA + DR1P | 1
Thymidine (deoxyuridine) phosphorylase | deoA | DT + PI <−> THY + DR1P | 1
Cytidylate kinase | cmkA | DCMP + ATP <−> ADP + DCDP | 1
Cytidylate kinase | cmkA | CMP + ATP <−> ADP + CDP | 1
Cytidylate kinase | cmkB | DCMP + ATP <−> ADP + DCDP | 1
Cytidylate kinase | cmkB | CMP + ATP <−> ADP + CDP | 1
Cytidylate kinase | cmkA | UMP + ATP <−> ADP + UDP | 1
Cytidylate kinase | cmkB | UMP + ATP <−> ADP + UDP | 1
dTMP kinase | tmk | DTMP + ATP <−> ADP + DTDP | 1
Uridylate kinase | pyrH | UMP + ATP <−> UDP + ADP | 1
Uridylate kinase | pyrH | DUMP + ATP <−> DUDP + ADP | 1
Thymidine (deoxyuridine) kinase | tdk | DU + ATP −> DUMP + ADP | 1
Uracil phosphoribosyltransferase | upp | URA + PRPP −> UMP + PPI | 1
Cytosine deaminase | codA | CYTS −> URA + NH3 | 1
Uridine kinase | udk | URI + GTP −> GDP + UMP | 1
Cytidine kinase | udk | CYTD + GTP −> GDP + CMP | 1
CMP glycosylase | | CMP −> CYTS + R5P | 0
Cytidine deaminase | cdd | CYTD −> URI + NH3 | 1
Thymidine (deoxyuridine) kinase | tdk | DT + ATP −> ADP + DTMP | 1
dCTP deaminase | dcd | DCTP −> DUTP + NH3 | 1
Cytidine deaminase | cdd | DC −> NH3 + DU | 1
5′-Nucleotidase | ushA | DUMP −> DU + PI | 1
5′-Nucleotidase | ushA | DTMP −> DT + PI | 1
5′-Nucleotidase | ushA | DAMP −> DA + PI | 1
5′-Nucleotidase | ushA | DGMP −> DG + PI | 1
5′-Nucleotidase | ushA | DCMP −> DC + PI | 1
5′-Nucleotidase | ushA | CMP −> CYTD + PI | 1
5′-Nucleotidase | ushA | AMP −> PI + ADN | 1
5′-Nucleotidase | ushA | GMP −> PI + GSN | 1
5′-Nucleotidase | ushA | IMP −> PI + INS | 1
5′-Nucleotidase | ushA | XMP −> PI + XTSN | 1
5′-Nucleotidase | ushA | UMP −> PI + URI | 1
Ribonucleoside-diphosphate reductase | nrdAB | ADP + RTHIO −> DADP + OTHIO | 2
Ribonucleoside-diphosphate reductase | nrdAB | GDP + RTHIO −> DGDP + OTHIO | 2
Ribonucleoside-triphosphate reductase | nrdD | ATP + RTHIO −> DATP + OTHIO | 1
Ribonucleoside-triphosphate reductase | nrdD | GTP + RTHIO −> DGTP + OTHIO | 1
Ribonucleoside-diphosphate reductase | nrdAB | CDP + RTHIO −> DCDP + OTHIO | 2
Ribonucleoside-diphosphate reductase II | nrdEF | CDP + RTHIO −> DCDP + OTHIO | 2
Ribonucleoside-diphosphate reductase | nrdAB | UDP + RTHIO −> DUDP + OTHIO | 2
Ribonucleoside-triphosphate reductase | nrdD | CTP + RTHIO −> DCTP + OTHIO | 1
Ribonucleoside-triphosphate reductase | nrdD | UTP + RTHIO −> OTHIO + DUTP | 1
dUTP pyrophosphatase | dut | DUTP −> PPI + DUMP | 1
Thymidylate synthetase | thyA | DUMP + METTHF −> DHF + DTMP | 1
Nucleoside triphosphatase | mutT | GTP −> GSN + 3 PI | 1
Nucleoside triphosphatase | mutT | DGTP −> DG + 3 PI | 1
Deoxyguanosinetriphosphate triphosphohydrolase | dgt | DGTP −> DG + 3 PI | 1
Deoxyguanosinetriphosphate triphophohydrolase dgtGTP −> GSN + 3 PI 1 Glycine cleavage system (Multi-component system)gcvHTP, IpdA GLY + THF + NAD −> METTHF + NADH + CO2 + NH3 4 Formyltetrahydrofolate deformylase purU FTHF −> FOR + THF 1 Methylenetetrahydrofolate reductase metF METTHF + NADH −> NAD + MTHF 1 MethyleneTHF dehydrogenase folD METTHF + NADP <−> METHF + NADPH 1 Methenyltetrahydrofolate cyclehydrolase folD METHE <−> FTHF 1 Acetyl-CoAcarboxyltransferase accABD ACCOA + ATP + CO2 <−> MALCOA + ADP + PI 3Malonyl-CoA-ACP transacylase fabD MALCOA + ACP <−> MALACP + COA 1Malonyl-ACP decarboxylase fadB MALACP −> ACACP + CO2 1 Acetyl-CoA-ACPtransacylase fabH ACACP + COA <−> ACCOA + ACP 1 Acyltransferase plsGL3P + 0.035 C140ACP + 0.102 C141ACP + 0.717 C160AC 0 CDP-Diacylglycerolsynthetase cdsA PA + CTP <−> CDPDG + PPI 1 CDP-Diacylglycerolpyrophosphatase cdh CDPDG −> CMP + PA 1 Phosphatidylserine synthase pssACDPDG + SER <−> CMP + PS 1 Phosphatidylserine decarboxylase psd PS −>PE + CO2 1 Phosphatidylglycerol phosphate synthase pgsA CDPDG + GL3P <−>CMP + PGP 1 Phosphatidylglycerol phosphate phosphatase A pgpA PGP −>PI + PG 0 Phosphatidylglycerol phosphate phosphatase B pgpB PGP −> PI +PG 1 Cardiolipin synthase cls 2 PG <−> CL + GL 1 Acetyl-CoAC-acetyltransferase atoB 2 ACCOA <−> COA + AACCOA 1Isoprenyl-pyrophosphate synthesis pathway T3P1 + PYR + 2 NADPH + ATP- >IPPP + ADP + 2 NADP + 0 Isoprenyl pyrophosphate isomerase IPPP −> DMPP 0Farnesyl pyrophosphate synthetase ispA DMPP + IPPP −> GPP + PPI 1Geranyltranstransferase ispA GPP + IPPP −> FPP + PPI 1 Octoprenylpyrophosphate synthase (5 reactions) ispB 5 IPPP + FPP −> OPP + 5 PPI 1Undecaprenyl pyrophosphate synthase (8 reactions) 8 IPPP + FPP −> UDPP +8 PPI 0 Chorismate pyruvate-lyase ubiC CHOR −> 4HBZ + PYR 1Hydroxybenzoate octaprenyltransferase ubiA 4HBZ + OPP −> O4HBZ + PPI 1Octaprenyl-hydroxybeuzoate decarboxylase ubiD, ubiX O4HBZ −> CO2 + 2OPPP1 2-Octaprenylphenol hydroxylase ubiB 2OPPP + O2 −> 206H 1 
Methylationreaction 2O6H + SAM −> 2OPMP + SAH 0 2-Octaprenyl-6-methoxyphenolhydroxylase ubiH 2OPMP + O2 −> 2OPMB 12-Octaprenyl-6-methoxy-1,4-benzoquinone methylase ubiE 2OPMB + SAM −>2OPMMB + SAH 0 2-Octaprenyl-3-methyl-6-methoxy-1,4- ubiF 2OPMMB + O2 −>2OMHMB 0 benzoquinone hydroxylase 3-Dimethylubiquinone3-methyltransferase ubiG 2OMHMB + SAM −> QH2 + SAH 1 Isochorismatesynthase 1 menF CHOR −> ICHOR 1 α-Ketoglutarate decarboxylase menD AKG +TPP −> SSALTPP + CO2 1 SHCHC synthase menD ICHOR + SSALTPP −> PYR +TPP + SHCHC 1 O-Succinylbenzoate-CoA synthase menC SHCHC −> OSB 1O-Succinylbenzoic acid-CoA ligase menE OSB + ATP + COA −> OSBCOA + AMP +PPI 1 Naphthoate synthase menB OSBCOA −> DHNA + COA 11,4-Dihydroxy-2-naphthoate octaprenyltransferase menA DHNA + OPP −>DMK + PPI + CO2 1 S-Adenosylmethionine-2-DMK methyltransferase menGDMK + SAM −> MK + SAH 1 Isochorismate synthase 2 entC CHOR −> ICHOR 1Isochorismatase entB ICHOR <−> 23DHDHB + PYR 12,3-Dihydo-2,3-dihydroxybenzoate dehydrogenase entA 23DHDHB + NAD <−>23DHB + NADH 1 ATP-dependent activation of 2,3-dihydroxybenzoate entE23DHB + ATP <−> 23DHBA + PPI 1 ATP-dependent serine activating enzymeentF SER + ATP <−> SERA + PPI 1 Enterochelin synthetase entD 3 SERA + 323DHBA −> ENTER + 6 AMP 1 GTP cyclohydrolase II ribA GTP −> D6RP5P +FOR + PPI 1 Pryimidine deaminase ribD D6RP5P −> A6RP5P + NH3 1Pyrimidine reductase ribD A6RP5P + NADPH −> A6RP5P2 + NADP 1 Pyrimidinephosphatase A6RP5P2 −> A6RP + PI 0 3,4 Dihydroxy-2-butanone-4-phosphatesynthase ribB RL5P −> DB4P + FOR 1 6,7-Dimethyl-8-ribityllumazinesynthase ribE DB4P + A6RP −> D8RL + PI 1 Riboflavin synthase ribH 2 D8RL−> RIBFLV + A6RP 1 Riboflavin kinase ribF RIBFLV + ATP −> FMN + ADP 1FAD synthetase ribF FMN + ATP −> FAD + PPI 1 GTP cyclohydrolase I folEGTP −> FOR + AHTD 1 Dihydroneopterin triphosphate pyrophosphorylase ntpAAHTD −> PPI + DHPP 1 Nucleoside triphosphatase mutT AHTD −> DHP + 3 PI 1Dihydroneopterin monophosphate dephosphorylase DHPP −> DHP + PI 
0Dihydroneopterin aldolase folB DHP −> AHHMP + GLAL 1 6-Hydroxymethyl-7,8dihydropterin pyrophosphokinase folK AHHMP + ATP −> AMP + AHHMD 1Aminodeoxychorismate synthase pabAB CHOR + GLN −> ADCHOR + GLU 2Aminodeoxychorismate lyase pabC ADCHOR −> PYR + PABA 1 Dihydropteroatesynthase folP PABA + AHHMD −> PPI + DHPT 1 Dihydrofolate synthetase folCDHPT + ATP + GLU −> ADP + PI + DHF 1 Dihydrofolate reductase folA DHF +NADPH −> NADP + THF 1 Ketopentoate hydroxymethyl transferase panBOIVAL + METTHF −> AKP + THF 1 Ketopantoate reductase panE AKP + NADPH −>NADP + PANT 0 Acetohyoxyacid isomeroreductase ilvC AKP + NADPH −> NADP +PANT 1 Aspartate decarboxylase panD ASP −> CO2 + bALA 1Pantoate-β-alanine ligase panC PANT + bALA + ATP −> AMP + PPI + PNTO 1Pantothenate kinase coaA PNTO + ATP −> ADP + 4PPNTO 1Phosphopantothenate-cysteine ligase 4PPNTO + CTP + CYS −> CMP + PPI +4PPNCYS 0 Phosphopantothenate-cysteine decarboxylase 4PPNCYS −> CO2 +4PPNTE 0 Phospho-pantethiene adenylyltransferase 4PPNTE + ATP −> PPI +DPCOA 0 DephosphoCoA kinase DPCOA + ATP −> ADP + COA 0 ACP Synthase acpSCOA −> PAP + ACP 1 Aspartate oxidase nadB ASP + FAD −> FADH + ISUCC 1Quinolate synthase nadA ISUCC + T3P2 −> PI + QA 1 Quinolatephosphoribosyl transferase nadC QA + PRPP −> NAMN + CO2 + PPI 1 NAMNadenylyl transferase nadD NAMN + ATP −> PPI + NAAD 0 NAMN adenylyltransferase nadD NMN + ATP −> NAD + PPI 0 Deamido-NAD ammonia ligasenadE NAAD + ATP + NH3 −> NAD + AMP + PPI 1 NAD kinase nadFG NAD + ATP −>NADP + ADP 0 NADP phosphatase NADP −> NAD + PI 0 DNA ligase lig NAD −>NMN + AMP 1 NMN amidohydrolase pncC NMN −> NAMN + NH3 0 NMNglycohydrolase (cytoplasmic) NMN −> R5P + NAm 0 NAm amidohydrolase pncANAm −> NAC + NH3 0 NAPRTase pncB NAC + PRPP + ATP −> NAMN + PPI + PI +ADP 1 NAD pyrophosphatase pnuE NADxt −> NMNxt + AMPxt 0 NMN permeasepnuC NMNxt −> NMN 1 NMN glycohydrolase (membrane bound) NMNxt −> R5P +NAm 0 Nicotinic acid uptake NACxt −> NAC 0 GSA synthetase hemM GLU + ATP−> GTRNA + AMP + PPI 1 
Glutamyl-tRNA synthetase gltX GLU + ATP −>GTRNA + AMP + PPI 1 Glutamyl-tRNA reductase hemA GTRNA + NADPH −> GSA +NADP 1 Glutamate-1-semialdehyde aminotransferase hemL GSA −> ALAV 1Porphobilinogen synthase hemB 8 ALAV −> 4 PBG 1 Hydroxymethylbilanesynthase hemC 4 PBG −> HMB + 4 NH3 1 Uroporphyrinogen III synthase hemDHMB −> UPRG 1 Uroporphyrin-III C-methyltransferase 1 hemX SAM + UPRG −>SAH + PC2 1 Uroporphyrin-Ill C-methyltransferase 2 cysG SAM + UPRG −>SAH + PC2 1 1,3-Dimethyluroporphyrinogen III dehydrogenase cysG PC2 +NAD −> NADH + SHCL 1 Siroheme ferrochelatase cysG SHCL −> SHEME 1Uroporphyrinogen decarboxylase hemE UPRG −> 4 CO2 + CPP 1Coproporphyrinogen oxidase, aerobic hemF O2 + CPP −> 2 CO2 + PPHG 2Protoporphyrinogen oxidase hemG O2 + PPHG −> PPIX 2 Ferrochelatase hemHPPIX −> PTH 1 Heme O synthase cyoE PTH + FPP −> HO + PPI 18-Amino-7-oxononanoate synthase bioF ALA + CHCOA <−> CO2 + COA + AONA 1Adenosylmethionine-8-amino-7-oxononanoate bioA SAM + AONA <−> SAMOB +DANNA 1 aminotransferase Dethiobiotin synthase bioD CO2 + DANNA + ATP<−> DTB + PI + ADP 1 Biotin synthase bioB DTB + CYS <−> BT 1Glutamate-cysteine ligase gshA CYS + GLU + ATP −> GC + PI + ADP 1Glutathione synthase gshB GLY + GC + ATP −> RGT + PI + ADP 1 Glutathionereductase gor NADPH + OGT <−> NADP + RGT 1 thiC protein thiC AIR −> AHM1 HMP kinase thiN AHM + ATP −> AHMP + ADP 0 HMP-phosphate kinase thiDAHMP + ATP −> AHMPP + ADP 0 Hypothetical T3P1 + PYR −> DTP 0 thiGprotein thiG DTP + TYR + CYS −> THZ + HBA + CO2 1 thiE protein thiEDTP + TYR + CYS −> THZ + HBA + CO2 1 thiF protein thiF DTP + TYR + CYS−> THZ + HBA + CO2 1 thiH protein thiH DTP + TYR + CYS −> THZ + HBA +CO2 1 THZ kinase thiM THZ + ATP −> THZP + ADP 0 Thiamin phosphatesynthase thiB THZP + AHMPP −> THMP + PPI 0 Thiamin kinase thiK THMP +ADP <−> THIAMIN + ATP 0 Thiamin phosphate kinase thiL THMP + ATP <−>TPP + ADP 0 Erythrose 4-phosphate dehydrogenase epd E4P + NAD <−> ER4P +NADH 1 Erythronate-4-phosphate dehydrogenase pdxB ER4P + NAD 
<−> OHB +NADH 1 Hypothetical transaminase/phosphoserine transaminase serC OHB +GLU <−> PHT + AKG 1 Pyridoxal-phosphate biosynthetic proteins pdxJ-pdxApdxAJ PHT + DX5P −> P5P + CO2 2 Pyridoxine 5′-phosphate oxidase pdxHP5P + O2 <−> PL5P + H2O2 1 Threonine synthase thrC PHT −> 4HLT + PI 1Hypothetical Enzyme 4HLT −> PYRDX 0 Pyridoxine kinase pdxK PYRDX + ATP−> P5P + ADP 1 Hypothetical Enzyme P5P −> PYRDX + PI 0 HypotheticalEnzyme PL5P −> PL + PI 0 Pyridoxine kinase pdxK PL + ATP −> PL5P + ADP 1Pyridoxine 5′-phosphate oxidase pdxH PYRDX + O2 <−> PL + H2O2 1Pyridoxine 5′-phosphate oxidase pdxH PL + O2 + NH3 <−> PDLA + H2O2 1Pyridoxine kinase pdxK PDLA + ATP −> PDLA5P + ADP 1 Hypothetical EnzymePDLA5P −> PDLA + PI 0 Pyridoxine 5′-phosphate oxidase pdxH PDLA5P + O2−> PL5P + H2O2 + NH3 1 Serine hydroxymethyltransferase (serinemethylase) glyA PL5P + GLU −> PDLA5P + AKG 1 Serinehydroxymethyltransferase (serile methylase) glyA PL5P + ALA −> PDLA5P +PYR 1 Glutamine fructose-6-phosphate Transaminase glmS F6P + GLN −>GLU + GA6P 1 Phosphoglucosamine mutase glmM GA6P <−> GA1P 0N-Acetylglucosamine-1-phosphate-uridyltransferase glmU UTP + GAlP +ACCOA −> UDPNAG + PPI + COA 1 UDP-N-acetylglucosamine acyltransferaselpxA C140ACP + UDPNAG −> ACP + UDPG2AA 1UDP-3-O-acyl-N-acetylglucosamine deacetylase lpxC UDPG2AA −> UDPG2A + AC1 UDP-3-O-(3-hydroxymyristoyl)glucosamine- lpxD UDPG2A + C140ACP −>ACP + UDPG23A 1 acyltransferase UDP-sugar hydrolase ushA UDPG23A −>UMP + LIPX 1 Lipid A disaccharide synthase lpxB LIPX + UDPG23A −> UDP +DISAC1P 1 Tetraacyldisaccharide 4′ kinase DISAC1P + ATP −> ADP + LIPIV 03-Deoxy-D-manno-octulosonic-acid transferase kdtA LIPIV + CMPKDO −>KDOLIPIV + CMP 1 (KDO transferase) 3-Deoxy-D-manno-octulosonic-acidtransferase kdtA KDOLIPIV + CMPKDO −> K2LIPIV + CMP 1 (KDO transferase)Endotoxin synthase htrB, msbB K2LIPIV + C140ACP + C120ACP −> LIPA + 2ACP 2 3-Deoxy-D-manno-octulosonic-acid 8-phosphate kdsA PEP + A5P −>KDOP + PI 1 synthase 3-Deoxy-D-manno-octulosonic-acid 
8-phosphate KDOP−> KDO + PI 0 phosphatase CMP-2-keto-3-deoxyoctonate synthesis kdsBKDO + CTP −> PPI + CMPKDO 1 ADP-L-glycero-D-mannoheptose-6-epimeraselpcA, rfaED S7P + ATP −> ADPHEP + PPI 1 UDP glucose-1-phosphateuridylyltransferase galU, galF G1P + UTP −> PPI + UDPG 2 Ethanolaminephosphotransferase PE + CMP <−> CDPETN + DGR 0 Phosphatidate phosphatasePA −> PI + DGR 0 Diacylglycerol kinase dgkA DGR + ATP −> ADP + PA 1 LPSSynthesis - truncated version of LPS (ref neid) rfaLJIGFC LIPA + 3ADPHEP + 2 UDPG + 2 CDPETN + 3 CMPKDO −> 6UDP-N-acetylglucosamine-enolpyruvate transferase murA UDPNAG + PEP −>UDPNAGEP + PI 1 UDP-N-acetylglucosamine-enolpyruvate dehydrogenase murBUDPNAGEP + NADPH −> UDPNAM + NADP 1 UDP-N-acetylmuramate-alanine ligasemurC UDPNAM + ALA + ATP −> ADP + PI + UDPNAMA 1UDP-N-acetylmuramoylalanine-D-glutamate ligase murD UDPNAMA + DGLU + ATP−> UDPNAMAG + ADP + PI 1 UDP-N-acetylmuramoylalanyl-D-glutamate2,6-diamino- murE UDPNAMAG + ATP + MDAP −> UNAGD + ADP + PI 1 pimelatelig D-Alanine-D-alanine adding enzyme murF UNAGD + ATP + AA −> UNAGDA +ADP + PI 1 Glutamate racemase murI GLU <−> DGLU 1 D-ala:D-ala ligasesddlAB 2 DALA <−> AA 2 Phospho-N-acetylmuramoylpentapeptide transferasemraY UNAGDA −> UMP + PI + UNPTDO 1 N-Acetylglucosaminyl transferase murGUNPTDO + UDPNAG −> UDP + PEPTIDO 1 Arabinose (low affinity) araEARABxt + HEXT <−> ARAB 1 Arabinose (high affinity) araFGH ARABxt + ATP−> ARAB + ADP + PI 3 Dihydroxyacetone DHAxt + PEP −> T3P2 + PYR 0Fructose fruABF FRUxt + PEP −> F1P + PYR 2 Fucose fucP FUCxt + HEXT <−>FUC 1 Galacitol gatABC GLTLxt + PEP −> GLTL1P + PYR 3 Galactose (lowaffinity) galP GLACxt + HEXT −> GLAC 1 Galactose (low affinity) galPGLCxt + HEXT −> GLC 1 Galactose (high affinity) mglABC GLACxt + ATP −>GLAC + ADP + PI 3 Glucitol srlA1A2B GLTxt + PEP −> GLT6P + PYR 3Gluconate gntST GLCNxt + ATP −> GLCN + ADP + PT 1 Glucose ptsG, crrGLCxt + PEP −> G6P + PYR 2 Glycerol glpF GLxt <−> GL 1 Lactose lacYLCTSxt + NEXT <−> LCTS 1 Maltose malX, crr, 
malE MLTxt + PEP −> MLT6P +PYR 7 Mannitol mtlA, cmtAB MNTxt + PEP −> MNT6P + PYR 3 Mannose manATZ,ptsPA MANxt + PEP −> MAN1P + PYR 6 Melibiose melB MELIxt + HEXT −> MELI1 N-Acetylglucosamine nagE, ptsN NAG + PEP −> NAGP + PYR 2 Rhamnose rhaTRMNxt + ATP −> RMN + ADP + PI 1 Ribose rbsABCD, xylH RIBxt + ATP −>RIB + ADP + PI 5 Sucrose scr SUCxt + PEP −> SUC6P + PYR 0 TrehalosetreAB TRExt + PEP −> TRE6P + PYR 2 Xylose (low affinity) xylE XYLxt +NEXT −> XYL 1 Xylose (high affinity) xylFG, rbsB XYLxt + ATP −> XYL +ADP + PI 3 Alanine cycA ALAxt + ATP −> ALA + ADP + PI 1 ArginineartPMQJI, arg ARGxt + ATP −> ARG + ADP + PI 9 Asparagine (low Affinity)ASNxt + HEXT <−> ASN 0 Asparagine (high Affinity) ASNxt + ATP −> ASN +ADP + PI 0 Aspartate gltP ASPxt + HEXT −> ASP 1 Aspartate gltJKL ASPxt +ATP −> ASP + ADP + PI 3 Branched chain amino acid transport brnQBCAAxt + HEXT <−> BCAA 1 Cysteine not identified CYSxt + ATP −> CYS +ADP + PI 0 D-Alanine cycA DALAxt + ATP −> DALA + ADP + PI 1 D-Alanineglycine permease cycA DALAxt + HEXT <−> DALA 1 D-Alanine glycinepermease cycA DSERxt + HEXT <−> DSER 1 D-Alanine glycine permease cycAGLYxt + HEXT <−> GLY 1 Diaminopimelic acid MDAPxt + ATP −> MDAP + ADP +PI 0 γ-Aminobutyrate transport gabP GABAxt + ATP −> GABA + ADP + PI 1Glutamate gltP GLUxt + HEXT <−> GLU 1 Glutamate gltS GLUxt + HEXT <−>GLU 1 Glutamate gltJKL GLUxt + ATP −> GLU + ADP + PI 3 Glutamine glnHPQGLNxt + ATP −> GLN + ADP + PI 3 Glycine cycA, proVWX GLYxt + ATP −>GLY + ADP + PI 4 Histidine hisJMPQ HISxt + ATP −> HIS + ADP + PI 4Isoleucine livJ ILExt + ATP −> ILE + ADP + PI 1 Leucine livHKM/livFGJLEUxt + ATP −> LEU + ADP + PI 6 Lysine lysP LYSxt + HEXT <−> LYS 1Lysine argT, hisMPQ LYSxt + ATP −> LYS + ADP + PI 4 Lysine/CadaverinecadB LYSxt + ATP −> LYS + ADP + PI 1 Methionine metD METxt + ATP −>MET + ADP + PI 0 Ornithine argT, hisMPQ ORNxt + ATP −> ORN + ADP + PI 4Phenlyalanine aroP/mtr/pheP PHExt + HEXT <−> PHE 3 Proline putP, proPWXPROxt + HEXT <−> PRO 4 Proline cycA, proVW 
PROxt + ATP −> PRO + ADP + PI4 Putrescine potEFHIG PTRCxt + ATP −> PTRC + ADP + PI 5 Serine sdaCSERxt + HEXT <−> SER 1 Serine cycA SERxt + ATP −> SER + ADP + PI 1Spermidine & putrescine potABCD SPMDxt + ATP −> SPMD + ADP + PI 4Spermidine & putrescine potABCD PTRCxt + ATP −> PTRC + ADP + PI 4Threonine livJ THRxt + ATP −> THR + ADP + PI 1 Threonine tdcC THRxt +HEXT <−> THR 1 Tryptophan tnaB TRPxt + HEXT <−> TRP 1 Tyrosine tyrPTYRxt + HEXT <−> TYR 1 Valine livJ VALxt + ATP −> VAL + ADP + PI 1Dipeptide dppABCDF DIPEPxt + ATP −> DIPEP + ADP + PI 5 OligopeptideoppABCDF OPEPxt + ATP −> OPEP + ADP + PI 5 Peptide sapABD PEPTxt + ATP−> PEPT + ADP + PI 3 Uracil uraA URAxt + HEXT −> URA 1 Nicotinamidemononucleotide transporter pnuC NMNxt + HEXT −> + NMN 1 Cytosine codBCYTSxt + HEXT −> CYTS 1 Adenine purB ADxt + HEXT −> AD 1 Guanine gpt,hpt GNxt <−> GN 2 Hypoxanthine gpt, hpt HYXNxt <−> HYXN 2 XanthosinexapB XTSNxt <−> XTSN 1 Xanthine gpt XANxt <−> XAN 1 G-system nupGADNxt + NEXT −> ADN 1 G-system nupG GSNxt + NEXT −> GSN 1 G-system nupGURIxt + NEXT −> URI 1 G-system nupG CYTDxt + HEXT −> CYTD 1 G-system(transports all nucleosides) nupG INSxt + HEXT −> INS 1 G-system nupGXTSNxt + HEXT −> XTSN 1 G-system nupG DTxt + HEXT −> DT 1 G-system nupGDINxt + HEXT −> DIN 1 G-system nupG DGxt + HEXT −> DG 1 G-system nupGDAxt + HEXT −> DA 1 G-system nupG DCxt + HEXT −> DC 1 G-system nupGDUxt + HEXT −> DU 1 C-system nupC ADNxt + HEXT −> ADN 1 C-system nupCURIxt + HEXT −> UIRI 1 C-system nupC CYTDxt + HEXT −> CYTD 1 C-systemnupC DTxt + HEXT −> DT 1 C-system nupC DAxt + HEXT −> DA 1 C-system nupCDCxt + HEXT −> DC 1 C-system nupC DUxt + HEXT −> DU 1 Nucleosides anddeoxynucleoside tsx ADNxt + HEXT −> ADN 1 Nucleosides anddeoxynucleoside tsx GSNxt + HEXT −> GSN 1 Nucleosides anddeoxynucleoside tsx URIxt + HEXT −> URI 1 Nucleosides anddeoxynucleoside tsx CYTDxt + HEXT −> CYTD 1 Nucleosides anddeoxynucleoside tsx INSxt + HEXT −> INS 1 Nucleosides anddeoxynucleoside tsx XTSNxt + HEXT −> XTSN 1 
Nucleosides anddeoxynucleoside tsx DTxt + HEXT −> DT 1 Nucleosides and deoxynucleosidetsx DINxt + HEXT −> DIN 1 Nucleosides and deoxynucleoside tsx DGxt +HEXT −> DG 1 Nucleosides and deoxynucleoside tsx DAxt + HEXT −> DA 1Nucleosides and deoxynucleoside tsx DCxt + HEXT −> DC 1 Nucleosides anddeoxynucleoside tsx DUxt + HEXT −> DU 1 Acetate transport ACxt + HEXT<−> AC 0 Lactate transport LACxt + HEXT <−> LAC 0 L-Lactate lldPLLACxt + HEXT <−> LLAC 1 Formate transport focA FORxt <−> FOR 1 Ethanoltransport ETHxt + HEXT <−> ETH 0 Succinate transport dcuAB SUCCxt + HEXT<−> SUCC 2 Pyruvate transport PYRxt + HEXT <−> PYR 0 Ammonia transportamtB NH3xt + HEXT <−> NH3 1 Potassium transport kdpABC Kxt + ATP −> K +ADP + PI 3 Potassium transport trkAEHG Kxt + HEXT K−> K 3 Sulfatetransport cysPTUWAZ, s SLFxt + ATP −> SLF + ADP + PI 7 Phosphatetransport pstABCS PIxt + ATP −> ADP + 2 PI 4 Phosphate transport pitABPIxt + HEXT <−> PI 2 Glycerol-3-phosphate glpT, ugpABCE GL3Pxt + PI −>GL3P 5 Dicarboxylates dcuAB, dctA SUCCxt + HEXT <−> SUCC 3Dicarboxylates dcuAB, dctA FUMxt + HEXT <−> FUM 3 Dicarboxylates dcuAB,dctA MALxt + HEXT <−> MAL 3 Dicarboxylates dcuAB, dctA ASPxt + HEXT <−>ASP 3 Fatty acid transport fadL C140xt −> C140 1 Fatty acid transportfadL C160xt −> C160 1 Fatty acid transport fadL C180xt −> C180 1α-Ketoglutarate kgtP AKGxt + HEXT <−> AKG 1 Na/H antiporter nhaABCNAxt + <−> NA + HEXT 2 Na/H antiporter chaABC NAxt + <−> NA + HEXT 3Pantothenate panF PNTOxt + HEXT <−> PNTO 1 Sialic acid permease nanTSLAxt + ATP −> SLA + ADP + PI 1 Oxygen transport O2xt <−> O2 0 Carbondioxide transport CO2xt <−> CO2 0 Urea transport UREAxt + 2 HEXT <−>UREA 0 ATP drain flux for constant maintanence requirements ATP −> ADP +PI 0 Glyceraldehyde transport gufP GLALxt <−> GLAL 0 Acetaldehydetransport ACALxt <−> ACAL 0
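
Reaction entries of the kind listed in Table 1 map directly onto columns of a stoichiometric matrix: substrates receive negative coefficients and products positive ones. The following is a minimal Python sketch of that mapping, not the implementation described in this specification. The function names (`parse_side`, `parse_reaction`, `build_matrix`) are hypothetical, the three example reactions are copied from Table 1, and labeling each column by its gene name is an illustrative simplification (a gene such as gsk appears with more than one reaction in the table).

```python
import re

# Three reactions copied from Table 1, keyed by gene name for illustration only.
EXAMPLE = {
    "gsk": "INS + ATP −> IMP + ADP",
    "mutT": "GTP −> GSN + 3 PI",
    "cls": "2 PG <−> CL + GL",
}

def parse_side(side):
    """Parse one side of a reaction, e.g. 'GSN + 3 PI' -> {'GSN': 1.0, 'PI': 3.0}."""
    coeffs = {}
    for term in side.split(" + "):
        term = term.strip()
        if not term:
            continue
        # An optional numeric coefficient must be separated from the name by
        # whitespace, so names that start with a digit (e.g. 4HBZ) stay intact.
        match = re.fullmatch(r"(?:(\d+(?:\.\d+)?)\s+)?(\S+)", term)
        if match is None:
            raise ValueError(f"cannot parse term: {term!r}")
        amount = float(match.group(1) or 1)
        coeffs[match.group(2)] = coeffs.get(match.group(2), 0.0) + amount
    return coeffs

def parse_reaction(text):
    """Return (column, reversible) where column maps metabolite -> net coefficient."""
    text = text.replace("\u2212", "-")  # normalize the typographic minus sign
    reversible = "<->" in text
    lhs, rhs = text.split("<->" if reversible else "->")
    column = {}
    for name, amount in parse_side(lhs).items():
        column[name] = column.get(name, 0.0) - amount  # substrates are consumed
    for name, amount in parse_side(rhs).items():
        column[name] = column.get(name, 0.0) + amount  # products are formed
    return column, reversible

def build_matrix(reactions):
    """Build a dense matrix with one row per metabolite and one column per reaction."""
    parsed = [parse_reaction(text) for text in reactions.values()]
    columns = [column for column, _ in parsed]
    reversible = [flag for _, flag in parsed]
    metabolites = sorted({name for column in columns for name in column})
    matrix = [[column.get(name, 0.0) for column in columns] for name in metabolites]
    return metabolites, list(reactions), matrix, reversible

metabolites, reaction_ids, S, reversible = build_matrix(EXAMPLE)
for name, row in zip(metabolites, S):
    print(f"{name:>4}", row)
```

Entries whose reaction text is truncated in the table (for example, a row ending in a bare arrow) would need to be completed from the original source before they could be parsed this way.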

TABLE 2
Comparison of the predicted mutant growth characteristics from the gene deletion study to published experimental results with single and double mutants.

Columns: Gene, followed by growth on Glucose, Glycerol, Succinate, and Acetate, each scored as (in vivo/in silico).

aceEF | −/+
aceA | −/−
aceB | −/−
ackA | +/+
acs | +/+
acn | −/− −/− −/− −/−
cyd | +/+
cyo | +/+
eno | −/+ −/+ −/− −/−
fba | −/+
fbp | +/+ −/− −/− −/−
gap | −/− −/− −/− −/−
gltA | −/− −/− −/− −/−
gnd | +/+
idh | −/− −/− −/− −/−
ndh | +/+ +/+
nuo | +/+ +/+
pfk | −/+
pgi | +/+ +/+
pgk | −/− −/− −/− −/−
pgl | +/+
pntAB | +/+ +/+ +/+ +/+
glk | +/+
ppc | ±/+ −/+ +/+ +/+
pta | +/+
pts | +/+
pyk | +/+
rpi | −/− −/− −/− −/−
sdhABCD | +/+
tpi | −/+ −/− −/− −/−
unc | +/+ −/− −/−
zwf | +/+
sucAD | +/+
zwf, pnt | +/+
pck, mes | −/− −/−
pck, pps | −/− −/−
pgi, zwf | −/−
pgi, gnd | −/−
pta, acs | −/−
tktA, tktB | −/−

Results are scored as + or − meaning growth or no growth determined from in vivo/in silico data. In 73 of 80 cases the in silico behavior is the same as the experimentally observed behavior.
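
The agreement figure quoted for Table 2 can be checked by tallying the individual scores. The sketch below is illustrative only: the scores are transcribed from Table 2 in the order they appear, an ASCII hyphen stands in for the minus sign, and the helper names (`grows`, `tally`) are hypothetical. It also assumes that the single "±" score counts as growth, which is an assumption of this sketch and is not stated in the table.

```python
# Scores transcribed from Table 2 as "in vivo/in silico", in table order.
TABLE_2 = {
    "aceEF": ["-/+"], "aceA": ["-/-"], "aceB": ["-/-"], "ackA": ["+/+"],
    "acs": ["+/+"], "acn": ["-/-"] * 4, "cyd": ["+/+"], "cyo": ["+/+"],
    "eno": ["-/+", "-/+", "-/-", "-/-"], "fba": ["-/+"],
    "fbp": ["+/+", "-/-", "-/-", "-/-"], "gap": ["-/-"] * 4,
    "gltA": ["-/-"] * 4, "gnd": ["+/+"], "idh": ["-/-"] * 4,
    "ndh": ["+/+"] * 2, "nuo": ["+/+"] * 2, "pfk": ["-/+"],
    "pgi": ["+/+"] * 2, "pgk": ["-/-"] * 4, "pgl": ["+/+"],
    "pntAB": ["+/+"] * 4, "glk": ["+/+"],
    "ppc": ["±/+", "-/+", "+/+", "+/+"], "pta": ["+/+"], "pts": ["+/+"],
    "pyk": ["+/+"], "rpi": ["-/-"] * 4, "sdhABCD": ["+/+"],
    "tpi": ["-/+", "-/-", "-/-", "-/-"], "unc": ["+/+", "-/-", "-/-"],
    "zwf": ["+/+"], "sucAD": ["+/+"], "zwf, pnt": ["+/+"],
    "pck, mes": ["-/-"] * 2, "pck, pps": ["-/-"] * 2, "pgi, zwf": ["-/-"],
    "pgi, gnd": ["-/-"], "pta, acs": ["-/-"], "tktA, tktB": ["-/-"],
}

def grows(score):
    """Interpret one score; treating '±' as growth is an assumption of this sketch."""
    return score in ("+", "±")

def tally(table):
    """Count the cases where the in vivo and in silico scores agree."""
    total = 0
    agree = 0
    for scores in table.values():
        for score in scores:
            in_vivo, in_silico = score.split("/")
            total += 1
            if grows(in_vivo) == grows(in_silico):
                agree += 1
    return agree, total

agree, total = tally(TABLE_2)
print(f"{agree} of {total} cases agree")  # prints "73 of 80 cases agree"
```

Under the stated assumption, the seven disagreements are aceEF, eno (two conditions), fba, pfk, ppc (one condition), and tpi (one condition), all of which are cases where growth is predicted in silico but not observed in vivo.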

1-23. (canceled)
 24. A memory storing data for access by a software program being executed by at least one processor, comprising: a genome specific stoichiometric matrix stored in said memory, said genome specific stoichiometric matrix storing substrates, products, and stoichiometry for a plurality of metabolic reactions specific to an organism, wherein at least one of said metabolic reactions corresponds to a potential function of a candidate protein that is encoded by an open reading frame of the organism's genome and for which a function is not known.
 25. The memory of claim 24, wherein the potential function is based on homology of the open reading frame to a nucleotide encoding a protein of known function in another organism.
 26. The memory of claim 24, wherein the potential function is based on homology of an amino acid sequence of the candidate protein to an amino acid sequence of a protein of known function in another organism.
 27. The memory of claim 24, wherein said memory is selected from the group consisting of: a hard disk, optical memory, Random Access Memory, Read Only Memory and Flash Memory.
 28. The memory of claim 24, wherein said organism is Escherichia coli.